IES Blog

Institute of Education Sciences

From Data Collection to Data Release: What Happens?

In today’s world, much scientific data is collected automatically from sensors and processed by computers in real time to produce instant analytic results. People have grown accustomed to instant data and expect to get things quickly.

At the National Center for Education Statistics (NCES), we are frequently asked why, in a world of instant data, it takes so long to produce and publish data from surveys. Although improvements in the timeliness of federal data releases have been made, there are fundamental differences in the nature of data compiled by automated systems and specific data requested from federal survey respondents. Federal statistical surveys are designed to capture policy-related and research data from a range of targeted respondents across the country, who may not always be willing participants.

This blog post provides a brief overview of the survey data processing framework, but it’s important to understand that the survey design phase is, in itself, a highly complex and technical process. In contrast to a management information system, in which an organization has complete control over data production processes, federal education surveys are designed to represent the entire country and require coordination with other federal, state, and local agencies. After the necessary coordination activities have been concluded, and the response periods for surveys have ended, much work remains to be done before the survey data can be released.

Survey Response

One of the first sources of potential delay is that some jurisdictions or individuals are unable to fill in their surveys on time. Unlike opinion polls and online quizzes, which rely on convenience samples of anyone who feels like responding, NCES surveys use rigorously formulated samples meant to properly represent specific populations, such as states or the nation as a whole. To ensure proper representation, NCES follows up with nonresponding sampled individuals, education institutions, school districts, and states to maximize survey participation within the sample. Some large jurisdictions, such as the New York City school district, also have their own extensive survey operations to conclude before they can provide information to NCES. Before the New York City school district, which is larger than about two-thirds of all state education systems, can respond to NCES surveys, it must first gather information from all its schools. Receipt of data from New York City and other large districts is essential to compiling nationally representative data.

Editing and Quality Reviews

Waiting for final survey responses does not mean that survey processing comes to a halt. One of the most important roles NCES plays in survey operations is editing and conducting quality reviews of incoming data, which take place on an ongoing basis. In these quality reviews, a variety of strategies are used to make cost-effective and time-sensitive edits to the incoming data. For example, in the Integrated Postsecondary Education Data System (IPEDS), individual higher education institutions upload their survey responses and receive real-time feedback on responses that are out of range compared to prior submissions or instances where survey responses do not align in a logical way. All NCES surveys use similar logic checks in addition to a range of other editing checks that are appropriate to the specific survey. These checks typically look for responses that are out of range for a certain type of respondent.
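As a rough illustration of the two kinds of checks described above, the following Python sketch flags a submission whose parts do not add up (a logic check) or whose total differs sharply from the prior submission (a range check). The field names, the `Submission` record, the `edit_flags` function, and the 25 percent threshold are all hypothetical choices made for this example; they are not taken from IPEDS or any other NCES system.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Submission:
    """One institution's response (hypothetical fields for illustration)."""
    unit_id: str
    total_enrollment: int
    full_time: int
    part_time: int


def edit_flags(current: Submission,
               prior: Optional[Submission],
               max_change: float = 0.25) -> List[str]:
    """Return a list of flags raised by simple logic and range checks."""
    flags = []

    # Logic check: the reported parts should add up to the reported total.
    if current.full_time + current.part_time != current.total_enrollment:
        flags.append("logic: full_time + part_time != total_enrollment")

    # Range check: compare the total with the prior submission, if any.
    if prior is not None and prior.total_enrollment > 0:
        change = (abs(current.total_enrollment - prior.total_enrollment)
                  / prior.total_enrollment)
        if change > max_change:
            flags.append(
                f"range: total_enrollment changed {change:.0%} "
                "from prior submission"
            )

    return flags


# Usage: a response that fails both checks against last year's submission.
last_year = Submission("100001", 1000, 700, 300)
this_year = Submission("100001", 1500, 900, 500)
for flag in edit_flags(this_year, last_year):
    print(flag)
```

A flag in a sketch like this would only prompt a second look; whether a flagged value is actually wrong still has to be settled with the respondent.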

Although most checks are automated, some particularly complicated or large responses may require individual review. For IPEDS, the real-time feedback described above is followed by quality review checks that are done after collection of the full dataset. This can result in individualized follow-up and review with institutions whose data still raise substantive questions.

Sample Weighting

In order to lessen the burden on the public and reduce costs, NCES collects data from selected samples of the population rather than taking a full census of the entire population for every study. In all sample surveys, a range of additional analytic tasks must be completed before data can be released. One of the more complicated tasks is constructing weights based on the original sample design and survey responses so that the collected data can properly represent the nation and/or states, depending on the survey. These sample weights are designed so that analyses can be conducted across a range of demographic or geographic characteristics and properly reflect the experiences of individuals with those characteristics in the population.
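To make the idea of weighting concrete, here is a minimal Python sketch of one common textbook approach: each sampled unit starts with a base weight equal to the inverse of its selection probability, and respondents’ weights are then inflated within an adjustment cell to account for nonrespondents in that cell. This is an assumption-laden simplification, not a description of the procedure used for any particular NCES survey; the function name, the dictionary keys, and the sample values are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List


def nonresponse_adjusted_weights(sample: List[dict]) -> Dict[str, float]:
    """Compute weights for respondents only.

    Each unit is a dict with hypothetical keys:
      'id'        - unit identifier
      'cell'      - nonresponse adjustment cell (e.g., region or school type)
      'prob'      - probability of selection under the sample design
      'responded' - True if the unit completed the survey
    """
    sampled_total = defaultdict(float)     # sum of base weights, all sampled units
    respondent_total = defaultdict(float)  # sum of base weights, respondents only

    for unit in sample:
        base = 1.0 / unit["prob"]          # base weight = inverse selection probability
        sampled_total[unit["cell"]] += base
        if unit["responded"]:
            respondent_total[unit["cell"]] += base

    weights = {}
    for unit in sample:
        if unit["responded"]:
            # Respondents in a cell also stand in for that cell's nonrespondents.
            factor = sampled_total[unit["cell"]] / respondent_total[unit["cell"]]
            weights[unit["id"]] = (1.0 / unit["prob"]) * factor
    return weights


# Usage: cell A has 4 sampled units but only 2 respondents.
sample = [
    {"id": "a1", "cell": "A", "prob": 0.1, "responded": True},
    {"id": "a2", "cell": "A", "prob": 0.1, "responded": True},
    {"id": "a3", "cell": "A", "prob": 0.1, "responded": False},
    {"id": "a4", "cell": "A", "prob": 0.1, "responded": False},
    {"id": "b1", "cell": "B", "prob": 0.5, "responded": True},
    {"id": "b2", "cell": "B", "prob": 0.5, "responded": True},
]
weights = nonresponse_adjusted_weights(sample)
```

In this sketch the adjusted weights of the respondents add up to the same total as the base weights of everyone sampled, so the respondents together still represent the full population the sample was drawn from.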

If the survey response rate is too low, a “survey bias analysis” must be completed to ensure that the results will be sufficiently reliable for public use. For longitudinal surveys, such as the Early Childhood Longitudinal Study, multiple sets of weights must be constructed so that researchers using the data will be able to appropriately account for respondents who answered some but not all of the survey waves.

NCES surveys also include “constructed variables” to facilitate more convenient and systematic use of the survey data. Examples of constructed variables include socioeconomic status or family type. Other types of survey data also require special analytic considerations before they can be released. Student assessment data, such as those from the National Assessment of Educational Progress (NAEP), require that a number of highly complex processes be completed to ensure proper estimations for the various populations being represented in the results. For example, just the standardized scoring of multiple-choice and open-ended items can take thousands of hours of design and analysis work.

Privacy Protection

Release of data by NCES carries a legal requirement to protect the privacy of our nation’s children. Each NCES public-use dataset undergoes a thorough evaluation to ensure that it cannot be used to identify responses of individuals, whether they are students, parents, teachers, or principals. The datasets must be protected through item suppression, statistical swapping, or other techniques to ensure that multiple datasets cannot be combined in such a way as to identify any individual. This is a time-consuming process, but it is incredibly important to protect the privacy of respondents.
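One simple form of suppression can be sketched in a few lines of Python: when tabulating records by some characteristic, any cell with too few respondents is withheld rather than published. This is only an illustration of the general idea. The function name, the minimum cell size of 3, and the sample records are hypothetical, and a real disclosure review involves far more than this (for example, checking that a suppressed cell cannot be recovered from published totals or from other datasets).

```python
from collections import Counter
from typing import Dict, List, Optional


def suppressed_counts(records: List[dict],
                      key: str,
                      min_cell_size: int = 3) -> Dict[str, Optional[int]]:
    """Tabulate records by `key`, withholding counts below `min_cell_size`.

    A suppressed cell is reported as None instead of its true count.
    The default threshold is a hypothetical value chosen for this example.
    """
    counts = Counter(record[key] for record in records)
    return {
        category: (n if n >= min_cell_size else None)
        for category, n in counts.items()
    }


# Usage: the single rural respondent would be easy to identify,
# so that cell is withheld from the published table.
records = [
    {"locale": "urban"},
    {"locale": "urban"},
    {"locale": "urban"},
    {"locale": "urban"},
    {"locale": "rural"},
]
table = suppressed_counts(records, "locale")
```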

Data and Report Release

When the final data have been received and edited, the necessary variables have been constructed, and the privacy protections have been implemented, there is still more that must be done to release the data. The data must be put in appropriate formats with the necessary documentation for data users. NCES reports with basic analyses or tabulations of the data must be prepared. These products are independently reviewed within the NCES Chief Statistician’s office.

Depending on the nature of the report, the Institute of Education Sciences Standards and Review Office may conduct an additional review. After all internal reviews have been conducted, revisions have been made, and the final survey products have been approved, the U.S. Secretary of Education’s office is notified 2 weeks in advance of the pending release. During this notification period, appropriate press release materials and social media announcements are finalized.

Although NCES can expedite some product releases, the work of preparing survey data for release often takes a year or more. NCES strives to maintain a balance between timeliness and providing the reliable, high-quality information that is expected of a federal statistical agency while also protecting the privacy of our respondents.

By Thomas Snyder

A New Guide to Education Data Privacy

By The National Forum on Education Statistics Education Data Privacy Working Group

The expanding use of data and new technologies for classroom instruction holds promise for facilitating learning and better personalizing education for students. However, these changes also heighten the responsibility of schools and education agencies to protect student privacy. The recently released Forum Guide to Education Data Privacy offers recommendations on how to do this.

Privacy is one of the most important issues in education data policy today. Many states have passed laws that require education agencies to implement strong privacy programs and procedures. State and local education agencies (SEAs and LEAs) are responding to the growing demands for privacy protection, as well as expectations for transparency in how student data are collected, used, and protected. Local and state members of the National Forum on Education Statistics (the Forum) identified a particular need for a resource that would assist SEAs and LEAs in working with school staff to ensure that student data are properly protected. The Forum established an Education Data Privacy Working Group tasked with developing a resource to help education agencies support school staff in responsibly using and sharing student data for instructional and administrative purposes, as well as strengthen agency privacy programs and related professional development efforts. The Forum Guide to Education Data Privacy was released in early July.

Chapter 1 of the guide includes information on

  • federal and state privacy laws;
  • the interrelationships among data governance, data security, and data privacy;
  • roles and responsibilities for protecting privacy at various agency levels; and
  • effective professional development on data privacy and security.

Chapter 2 includes 11 case studies designed to highlight common privacy issues related to the use of student data and presents basic approaches to managing those issues. Topics include

  • using online apps in the classroom;
  • responding to parent and PTA requests for student contact information;
  • using and sharing student data within a school;
  • sharing data among community schools and community-based organizations;
  • using data in presentations and training materials; and
  • using social media.

Each case study includes a scenario that exemplifies the privacy risk and offers various approaches and action steps that agencies can take to minimize the risk. The information presented in the case studies is based largely on the collective experience of members of the Forum.

The working group collaborated with the U.S. Department of Education’s Privacy Technical Assistance Center (PTAC) in the development of the guide. Links to free, helpful PTAC resources are highlighted throughout. 

It is important for education agencies to understand that there is no “one-size-fits-all” approach to protecting privacy. Each agency needs to consider relevant state and federal laws, state and local school board policies, parental expectations, student instructional needs, and the agency’s available resources when developing privacy guidelines and procedures. It is our hope that the Forum Guide to Education Data Privacy will help agencies develop privacy programs and procedures that fit their particular circumstances.    

About the National Forum on Education Statistics

The work of the National Forum on Education Statistics is a key aspect of the National Cooperative Education Statistics System. The Cooperative System was established to produce and maintain, with the cooperation of the states, comparable and uniform education information and data that are useful for policymaking at the federal, state, and local levels. To assist in meeting this goal, the National Center for Education Statistics (NCES), within the Institute of Education Sciences (IES) of the U.S. Department of Education, established the Forum to improve the collection, reporting, and use of elementary and secondary education statistics. The Forum addresses issues in education data policy, sponsors innovations in data collection and reporting, and provides technical assistance to improve state and local data systems.

Members of the Forum establish working groups to develop best practice guides in data-related areas of interest to federal, state, and local education agencies. They are assisted in this work by NCES, but the content comes from the collective experience of working group members who review all products iteratively throughout the development process. After the working group completes the content and reviews a document a final time, publications are subject to examination by members of the Forum standing committee that sponsors the project. Finally, Forum members (approximately 120 people) review and formally vote to approve all documents prior to publication. NCES provides final review and approval prior to online publication.

The information and opinions published in Forum products do not necessarily represent the policies or views of the U.S. Department of Education, IES, or NCES. For more information about the Forum, please visit nces.ed.gov/forum or contact Ghedam Bairu at Ghedam.bairu@ed.gov.