Life of the Data Bit: How Data Informs Education Policy

Here at Washington STEM, we rely on data that are publicly available. But how do we know they are reliable? In this blog, we’ll look at how we source and validate the data used in our reports and dashboards.

 

Data is essential. We use it to set goals, track progress and identify systemic inequities. You might think it is primarily found in spreadsheets, but we process data constantly in our daily life: What will you wear tomorrow? Better check the weather forecast. What time will you leave for work tomorrow? Depends on the traffic reports.

A good education helps us hone our instincts about whether a data source is trustworthy, such as an academic journal that is peer-reviewed or a newspaper that follows journalistic codes of ethics. In recent years, mistrust of government and science has increased, often because of deliberate misinformation or a lack of understanding of how scientific research is conducted and how findings are validated through the peer review process.

Let’s start with “Consuela”, a hypothetical employer in Spokane…

It starts with a phone call

The phone rings and Consuela notices the (202) area code from Washington, D.C.

“It must be the BLS survey,” she thinks, referring to the Bureau of Labor Statistics.

Consuela owns a construction company in Spokane. Each month, she, and tens of thousands of employers like her, provide data on employment, productivity, technology use and other topics through automated phone surveys (Computer Assisted Telephone Interviewing or CATI). In the world of data collection, Consuela is known as a data administrator because she compiles and submits data and works with analysts at the requesting agency to confirm accuracy.

Consuela opens her spreadsheet where she tracks new hires. She reaches for the ringing phone. A bit* of data is about to be born.

*“Bit” is a portmanteau (a blend of words) of “binary digit.”

How data is sourced

Millions of data bits from employers and other survey respondents feed into databases managed by federal agencies like the U.S. Census Bureau and the U.S. Bureau of Labor Statistics, as well as state agencies like the Employment Security Department and the Department of Commerce, among others. Each of these agencies has a team of data analysts who collect data, clean errors (such as empty cells or incorrectly formatted dates), disaggregate it (that is, separate it into component parts), and anonymize it. This last step removes any identifying information, like names or addresses, so that individuals’ privacy is protected.
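For readers who like to see the steps spelled out, here is a minimal sketch of cleaning, anonymizing, and disaggregating a handful of survey rows. It is not any agency’s actual pipeline. The rows, the field names, and the helper functions (`parse_date`, `clean`, `anonymize`, `disaggregate`) are all invented for illustration, and a real analyst might repair or follow up on incomplete rows rather than drop them.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical survey rows: every name, field, and value here is made up.
raw_rows = [
    {"name": "Employer A", "industry": "Construction", "hire_date": "2024-03-01", "new_hires": "4"},
    {"name": "Employer B", "industry": "Retail", "hire_date": "03/15/2024", "new_hires": "2"},
    {"name": "Employer C", "industry": "Construction", "hire_date": "2024-03-20", "new_hires": ""},
]

# Date formats this sketch knows how to read (an assumption, not a standard).
DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y")


def parse_date(text):
    """Return a date for any recognized format, or None if none match."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).date()
        except ValueError:
            continue
    return None


def clean(rows):
    """Standardize dates and drop rows with empty cells or unreadable dates."""
    cleaned = []
    for row in rows:
        date = parse_date(row["hire_date"])
        if date is None or row["new_hires"].strip() == "":
            continue  # simplest possible handling: skip the incomplete row
        cleaned.append({**row, "hire_date": date.isoformat(), "new_hires": int(row["new_hires"])})
    return cleaned


def anonymize(rows, identifying_fields=("name",)):
    """Remove fields that could identify an individual respondent."""
    return [{k: v for k, v in row.items() if k not in identifying_fields} for row in rows]


def disaggregate(rows, key):
    """Split the total number of new hires into component parts by `key`."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[key]] += row["new_hires"]
    return dict(totals)


records = anonymize(clean(raw_rows))
by_industry = disaggregate(records, "industry")
print(by_industry)
```

In this toy example, the row with the empty cell is dropped, the second row’s date is rewritten in the same format as the others, and no employer names survive into the final records.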

Washington STEM uses open-source (that is, publicly available) data sets from a variety of state and federal sources in our data dashboards and tools. Our data tools provide the latest research in early care and education, K-12 education, and career pathways for the general public, including legislators, educators, employers, and community-based organizations, so they can understand where they are, forecast future needs, and ensure the education-to-workforce pipeline is strong.

Education data in Washington

But when it comes to reporting education outcomes, the backbone of our STEM by the Numbers dashboard, we rely on data from the Education Research Data Center (ERDC), housed in the Office of Financial Management. The legislature created the ERDC in 2007 to collect and manage Washington’s education data from pre-kindergarten to college and the workforce, a longitudinal data set known as “P20W”. Fourteen state agencies collect this data, including the Office of Superintendent of Public Instruction (OSPI), the Department of Children, Youth, and Families (DCYF), the Department of Social and Health Services, the State Board for Community and Technical Colleges, and others.

Data administrators at each of these agencies, just like Consuela, are responsible for compiling data from their programs, such as student enrollment and demographics, kindergarten math-readiness scores, and graduation rates. The administrator then uploads the data to the ERDC portal where it undergoes quality checks before being added to a master database.

In May 2007, Governor Christine Gregoire created the P-20 Council to track student progress and transitions from preschool to college. That same year, the legislature passed a bill to create the Education Research Data Center (ERDC). In 2023, the ERDC’s processes and procedures were the subject of a study, and Washington STEM conducted a parallel review of the needs of data intermediaries. Most said they needed support to engage more effectively with the data being collected.

“We receive data from a lot of different data sources, then have to link it up in our data warehouse. As a result, we are always doing validation and quality checks,” said Bonnie Nelson, Senior Data Governance Specialist at ERDC.

Nelson said what makes ERDC unique in Washington is that it houses a “cross sector longitudinal data warehouse”, meaning it links multiple records from one individual student. “Each student generates a record when they go to school, college, and later when they get a job. ERDC puts it all in one record.”
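The idea of putting it all in one record can be sketched in a few lines. This is only an illustration, not ERDC’s actual method: it assumes every source already shares a made-up `student_id`, whereas real records from different agencies have to be matched and validated before they can be linked. All of the field names and values below are hypothetical.

```python
from collections import defaultdict

# Hypothetical records from three sectors; IDs, fields, and years are invented.
k12 = [
    {"student_id": "S001", "graduated_high_school": 2020},
    {"student_id": "S002", "graduated_high_school": 2021},
]
college = [
    {"student_id": "S001", "enrolled_college": 2020},
]
workforce = [
    {"student_id": "S001", "first_job_year": 2024},
    {"student_id": "S002", "first_job_year": 2021},
]


def link_records(*sources):
    """Merge records that share a student_id into one record per student."""
    linked = defaultdict(dict)
    for source in sources:
        for record in source:
            student = record["student_id"]
            for field, value in record.items():
                if field != "student_id":
                    linked[student][field] = value
    return dict(linked)


longitudinal = link_records(k12, college, workforce)
print(longitudinal["S001"])
```

Here the made-up student S001 ends up with a single record that spans high school, college, and a first job, while S002’s record simply has no college entry.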

From there, the data is fed into ERDC’s publications, including reports on Early Childhood Education, Student Outcomes, and others. Nelson said the ERDC’s primary users are state legislators, policy makers, state agencies, university researchers, and community-based organizations. ERDC is mandated by law to make data available for the public either through online dashboards or by request.

“It is our charge to be the stewards and connectors—it’s not to keep people out of the data, but to tell them, ‘We have something you might find interesting’ and help them access the data to improve student outcomes and experiences.”

This past year, Washington STEM and network partners reached out to 739 data users across the state, including practitioners, educators, researchers, policymakers, and community leaders and advocates, to ask if and how they use data and what challenges they face in doing so. The results show that 90% use data in their decision-making and planning, but fewer than 20 of the 739 said they felt knowledgeable about the state’s P20W data infrastructure or knew which agency to contact with their data questions. Over the next four years, Washington STEM will provide professional development and technical assistance to strengthen these partners’ capacity to engage with the data they use.

The High School to Postsecondary project helped schools access and analyze course-taking data. The results showed gender and ethnic disparities in course enrollment: Latino males were less likely to enroll in dual credit and continue on to postsecondary education. Photo credit: Jenny Jimenez

The stories data can tell

At Washington STEM, we don’t just collect data and create dashboards for fun. (Although visualizing data is fun—just ask our data scientist.) As stated at the start, data is important in setting goals, measuring progress, and identifying systemic problems.

For instance, five years ago, a career and college readiness coordinator at a Yakima high school had a hunch that student enrollment in dual credit programs at his school—often linked to increased likelihood of continuing into higher education—was not equitable, but he didn’t have the data to prove it.

So he reached out to Washington STEM for help in accessing and analyzing course-taking data. The results showed gender and ethnic disparities: Latino males were less likely to enroll in dual credit and continue on to postsecondary education.
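To see why disaggregated data matters, consider a small sketch of how an overall enrollment rate can hide a gap between groups. The students, group labels, and numbers below are entirely made up and are not the Yakima results; `enrollment_rates` is a hypothetical helper written for this example.

```python
from collections import Counter

# Invented course-taking rows: "Group A" and "Group B" are placeholder labels.
students = [
    {"group": "Group A", "dual_credit": True},
    {"group": "Group A", "dual_credit": True},
    {"group": "Group A", "dual_credit": True},
    {"group": "Group A", "dual_credit": False},
    {"group": "Group B", "dual_credit": True},
    {"group": "Group B", "dual_credit": False},
    {"group": "Group B", "dual_credit": False},
    {"group": "Group B", "dual_credit": False},
]


def enrollment_rates(rows, key="group"):
    """Return the share of students enrolled in dual credit for each group."""
    totals = Counter()
    enrolled = Counter()
    for row in rows:
        totals[row[key]] += 1
        if row["dual_credit"]:
            enrolled[row[key]] += 1
    return {group: enrolled[group] / totals[group] for group in totals}


# The overall rate looks unremarkable on its own.
overall = sum(row["dual_credit"] for row in students) / len(students)
# Splitting by group shows where the gap is.
rates = enrollment_rates(students)
print(overall, rates)
```

In this invented data, half of all students are enrolled in dual credit, which sounds fine until the rate is split by group: three in four for one group, one in four for the other.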

The Child Care Need and Supply data dashboard showed that out of all 39 counties in Washington, only two have adequate child care supply to meet the need.

Once the school administrators knew their data, they were able to make major improvements to help more students access dual credit programs. In 2022, lawmakers passed a bill requiring all schools to report student demographics in dual credit enrollment. Washington STEM continues to expand this program through the High School to Postsecondary Collaborative, with 40+ schools across the state that are starting to use data dashboards to see their own data and make changes at the school level.

Similarly, before the Fair Start for Kids Act was passed in 2021, data about child care need and supply were not readily available to the public. Min Hwangbo, Washington STEM Director of Impact, said, “The new law mandated more data transparency. As a result, the Department of Children, Youth, and Families partnered with Washington STEM to create five Early Learning dashboards providing a wide view of the industry.”

Although the Early Learning dashboards, the State of the Children data dashboard, and the regional reports have increased data availability, they have not done so for all children.

“There is a lack of consistent and accurate reporting of data for several key populations: children with disabilities, children experiencing homelessness, and Native American children,” said Hwangbo. He said this is because some child care industry data collection was voluntary, and during the pandemic it simply didn’t happen in some regions of the state. During the State of the Children co-design process, Washington STEM looked at the data sets with members from each of these communities and many of them said the numbers felt like an undercount.

Calls for early learning data clearinghouse

Although agencies like ERDC, DCYF and OSPI collect some data on preschoolers, there is currently no central clearinghouse for comprehensive, population-level data on early learning. Hwangbo said, “The current data infrastructure across various programs and organizations makes it hard for families to access the support they need, and hard for administrators to use data to improve support for children and families.”

Washington STEM recommends creating a statewide data clearinghouse to improve data access so everyone—legislators, educators, researchers, parents—can have what they need to plan and improve our system of early care and education.

So whether you are a data nerd or dipping your toe into the data world for the first time, we invite you to use Washington STEM’s Data Tools. And next time you hear the economic reports on the morning news, think of Consuela and the other data administrators who stand behind those numbers.

 
 

“Which Washington STEM data tool should I use?”

 

 
Key
BLS — U.S. Bureau of Labor Statistics
Census — U.S. Census Bureau
CCA — Child Care Aware
COMMS — Washington State Department of Commerce
DCYF — Washington State Department of Children, Youth, and Families
ECEAP — Early Childhood Education Assistance Program
ERDC — Education Research Data Center
ESD — Washington State Employment Security Department
OFM — Office of Financial Management
OSPI — Office of Superintendent of Public Instruction