IES Blog

Institute of Education Sciences

Building Bridges: Increasing the Power of the Civil Rights Data Collection (CRDC) Through Data Linking With an ID Crosswalk

On October 15, 2020, the U.S. Department of Education’s (ED) Office for Civil Rights (OCR) released the 2017–18 Civil Rights Data Collection (CRDC). The CRDC is a biennial survey that ED has conducted since 1968 to collect data on key education and civil rights issues in our nation’s public schools. The CRDC provides data on student enrollment and educational programs and services, most of which are disaggregated by students’ race/ethnicity, sex, limited English proficiency designation, and disability status. The CRDC is an important part of OCR’s overall strategy for administering and enforcing the civil rights statutes that apply to U.S. public schools. The information collected through the CRDC is also used by other ED offices as well as by policymakers and researchers outside of ED.

As a standalone data collection, the CRDC provides a wealth of information. However, the analytic power and scope of the CRDC can be enhanced by linking it to other ED and government data collections.

A Crosswalk to Link CRDC Data to Other Data Collections

To facilitate joining CRDC data to these and other data collections, NCES developed an ID crosswalk. This crosswalk is necessary because there are instances when the CRDC school ID number (referred to as a combo key) does not match the NCES school ID number assigned in other data collections (see the “Mismatches Between ID Numbers” section below for reasons why this may occur). By linking the CRDC to other data collections, researchers can answer questions that CRDC data alone cannot.
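In practice, linking with the crosswalk works like a database join on the two ID columns. The sketch below shows the idea with pandas; the file layout and column names (`crdc_combo_key`, `nces_school_id`, and the sample values) are hypothetical stand-ins, not the crosswalk’s actual format.

```python
import pandas as pd

# Hypothetical column names and toy values -- consult the crosswalk's
# documentation for the actual file layout.
crosswalk = pd.DataFrame({
    "crdc_combo_key": ["360000100001", "360000100002"],
    "nces_school_id": ["360007700001", "360007800002"],
})
crdc = pd.DataFrame({
    "crdc_combo_key": ["360000100001", "360000100002"],
    "enrollment": [540, 312],
})
other = pd.DataFrame({
    "nces_school_id": ["360007700001", "360007800002"],
    "title_i_status": ["Yes", "No"],
})

# Two joins: CRDC -> crosswalk -> other collection. In real use, an
# outer merge with indicator=True would surface unmatched ID numbers.
linked = (
    crdc.merge(crosswalk, on="crdc_combo_key")
        .merge(other, on="nces_school_id")
)
print(linked)
```

The same two-step join works in any statistical package (SAS, Stata, SPSS); the crosswalk simply supplies the bridge between the two ID systems.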



Mismatches Between ID Numbers

Mismatches between CRDC combo key numbers and NCES ID numbers may occur because of differences in how schools and districts are reported in the CRDC and other collections and because of differences in the timing of collections. Below are some examples.

  • Differences in how schools and school districts are reported in the CRDC and other data collections:
    • New York City Public Schools is reported as a single district in the CRDC but as multiple districts (with one supervisory union and 33 components of the supervisory union) in other data collections. Thus, the district will have one combo key in the CRDC but multiple ID numbers in other data collections.
    • Sometimes charter schools are reported differently in the CRDC compared with other data collections. For example, some charter schools in California are reported as independent (with each school serving as its own school district) in the CRDC but as a single combined school district in other data collections. Thus, each school will have its own combo key in the CRDC, but there will be one ID number for the combined district in other data collections.
    • There are differences between how a state or school district defines a school compared with how other data collections define a school.
  • Differences in the timing of the CRDC and other data collections:
    • There is a lag between when the CRDC survey universe is planned and when the data collection begins. During this time, a new school may open. Since the school has not yet been assigned an ID number, it is reported in the CRDC as a new school.


Interested in using the ID crosswalk to link CRDC data with other data collections and explore a research question of your own? Visit https://www.air.org/project/research-evaluation-support-civil-rights-data-collection-crdc to learn more and access the crosswalk. For more information about the CRDC, visit https://ocrdata.ed.gov/.

 

By Jennifer Sable, AIR, and Stephanie R. Miller, NCES

From Data Collection to Data Release: What Happens?

In today’s world, much scientific data is collected automatically from sensors and processed by computers in real time to produce instant analytic results. People have grown accustomed to instant data and expect to get information quickly.

At the National Center for Education Statistics (NCES), we are frequently asked why, in a world of instant data, it takes so long to produce and publish data from surveys. Although the timeliness of federal data releases has improved, there are fundamental differences between data compiled by automated systems and the specific data requested from federal survey respondents. Federal statistical surveys are designed to capture policy-related and research data from a range of targeted respondents across the country, who may not always be willing participants.

This blog is designed to provide a brief overview of the survey data processing framework, but it’s important to understand that the survey design phase is, in itself, a highly complex and technical process. In contrast to a management information system, in which an organization has complete control over data production processes, federal education surveys are designed to represent the entire country and require coordination with other federal, state, and local agencies. After the necessary coordination activities have been concluded, and the response periods for surveys have ended, much work remains to be done before the survey data can be released.

Survey Response

One of the first sources of potential delay is that some jurisdictions or individuals are unable to complete their surveys on time. Unlike opinion polls and online quizzes, which rely on whoever chooses to respond (convenience samples), NCES surveys use rigorously constructed samples meant to properly represent specific populations, such as states or the nation as a whole. To preserve that representation, NCES follows up with nonresponding sampled individuals, education institutions, school districts, and states to secure the maximum possible participation. Some large jurisdictions also have extensive survey operations of their own to conclude before they can provide information to NCES. The New York City school district, for example, is larger than about two-thirds of all state education systems and must first gather information from all its schools before it can respond. Receipt of data from New York City and other large districts is essential to compiling nationally representative data.

Editing and Quality Reviews

Waiting for final survey responses does not mean that survey processing comes to a halt. One of the most important roles NCES plays in survey operations is editing and conducting quality reviews of incoming data, which take place on an ongoing basis. In these quality reviews, a variety of strategies are used to make cost-effective and time-sensitive edits to the incoming data. For example, in the Integrated Postsecondary Education Data System (IPEDS), individual higher education institutions upload their survey responses and receive real-time feedback on responses that are out of range compared to prior submissions or instances where survey responses do not align in a logical way. All NCES surveys use similar logic checks in addition to a range of other editing checks that are appropriate to the specific survey. These checks typically look for responses that are out of range for a certain type of respondent.
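The logic of these automated edits can be sketched in a few lines. In this illustration, the field names, tolerance, and counts are invented for the example; they are not actual IPEDS edit rules.

```python
# Hypothetical edit checks: the field names, 25% tolerance, and counts
# are invented for illustration, not actual IPEDS edit rules.

def range_check(value, prior_value, tolerance=0.25):
    """Flag a response that moved more than `tolerance` from the prior submission."""
    if prior_value == 0:
        return value != 0
    return abs(value - prior_value) / prior_value > tolerance

def logic_check(total_enrollment, fulltime, parttime):
    """Flag responses whose parts do not add up to the reported total."""
    return fulltime + parttime != total_enrollment

# A large jump from the prior year's enrollment triggers review...
flags = {
    "enrollment_out_of_range": range_check(9800, 5000),
    # ...and internally inconsistent counts trigger review as well.
    "enrollment_inconsistent": logic_check(9800, 6000, 3900),
}
print(flags)
```

Real collections layer many such checks, tuned to each survey, on top of human review of the flagged cases.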

Although most checks are automated, some particularly complicated or large responses may require individual review. For IPEDS, the real-time feedback described above is followed by quality review checks conducted after the full dataset has been collected. This can result in individualized follow-up with institutions whose data still raise substantive questions.

Sample Weighting

In order to lessen the burden on the public and reduce costs, NCES collects data from selected samples of the population rather than taking a full census of the entire population for every study. In all sample surveys, a range of additional analytic tasks must be completed before data can be released. One of the more complicated tasks is constructing weights based on the original sample design and survey responses so that the collected data can properly represent the nation and/or states, depending on the survey. These sample weights are designed so that analyses can be conducted across a range of demographic or geographic characteristics and properly reflect the experiences of individuals with those characteristics in the population.
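The core idea behind base weights can be shown with a toy example (the strata, counts, and response values below are invented): each respondent carries a weight equal to the number of population members it represents, so weighted estimates reflect the population rather than the sample.

```python
# Toy stratified-sample weighting with invented numbers: each sampled
# school's base weight = population schools / sampled schools in its stratum.
population = {"city": 8000, "rural": 2000}   # schools in the frame
sampled    = {"city": 400,  "rural": 400}    # schools drawn per stratum

base_weight = {s: population[s] / sampled[s] for s in population}
# city schools each represent 20 schools; rural schools each represent 5.

# Suppose every sampled city school offers some program (1) and no rural
# school does (0). The unweighted mean would be 0.5, overstating the
# rural share; weighting restores each stratum's population share.
responses = [("city", 1)] * 400 + [("rural", 0)] * 400
weighted_total = sum(base_weight[s] * y for s, y in responses)
weighted_mean = weighted_total / sum(base_weight[s] for s, _ in responses)
print(round(weighted_mean, 2))  # 0.8: the city share of the population
```

Production weights go well beyond this sketch, incorporating nonresponse adjustments and calibration to known population totals.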

If the survey response rate is too low, a “survey bias analysis” must be completed to ensure that the results will be sufficiently reliable for public use. For longitudinal surveys, such as the Early Childhood Longitudinal Study, multiple sets of weights must be constructed so that researchers using the data will be able to appropriately account for respondents who answered some but not all of the survey waves.

NCES surveys also include “constructed variables” to facilitate more convenient and systematic use of the survey data; examples include socioeconomic status and family type. Other types of survey data also require special analytic considerations before they can be released. Student assessment data, such as those from the National Assessment of Educational Progress (NAEP), require a number of highly complex processes to ensure proper estimation for the various populations represented in the results. For example, the standardized scoring of multiple-choice and open-ended items alone can take thousands of hours of design and analysis work.

Privacy Protection

Release of data by NCES carries a legal requirement to protect the privacy of our nation’s children. Each NCES public-use dataset undergoes a thorough evaluation to ensure that it cannot be used to identify responses of individuals, whether they are students, parents, teachers, or principals. The datasets must be protected through item suppression, statistical swapping, or other techniques to ensure that multiple datasets cannot be combined in such a way as to identify any individual. This is a time-consuming process, but it is incredibly important to protect the privacy of respondents.

Data and Report Release

When the final data have been received and edited, the necessary variables have been constructed, and the privacy protections have been implemented, there is still more that must be done to release the data. The data must be put in appropriate formats with the necessary documentation for data users. NCES reports with basic analyses or tabulations of the data must be prepared. These products are independently reviewed within the NCES Chief Statistician’s office.

Depending on the nature of the report, the Institute of Education Sciences Standards and Review Office may conduct an additional review. After all internal reviews have been conducted, revisions have been made, and the final survey products have been approved, the U.S. Secretary of Education’s office is notified 2 weeks in advance of the pending release. During this notification period, appropriate press release materials and social media announcements are finalized.

Although NCES can expedite some product releases, the work of preparing survey data for release often takes a year or more. NCES strives to balance timeliness against the reliable, high-quality information expected of a federal statistical agency, all while protecting the privacy of our respondents.

 

By Thomas Snyder

Data Tools for College Professors and Students

Ever wonder what parts of the country produce the most English majors? Want to know which school districts have the most guidance counselors? The National Center for Education Statistics (NCES) has all the tools you need to dig into these and lots of other data!

Whether you’re a student embarking on a research project or a college professor looking for a large data set to use for an assignment, NCES has you covered. Below, check out the tools you can use to conduct searches, download datasets, and generate your own statistical tables and analyses.

 

Conduct Publication Searches

Two search tools help researchers identify potential data sources for their study and explore prior research conducted with NCES data. The Publications & Products Search Tool can be used to search for NCES publications and data products. The Bibliography Search Tool, which is updated continually, allows users to search for individual citations from journal articles that have been published using data from most surveys conducted by NCES.

Key reference publications include the Digest of Education Statistics, which is a comprehensive library of statistical tabulations, and The Condition of Education, which highlights up-to-date trends in education through statistical indicators.

 

Learn with Instructional Modules

The Distance Learning Dataset Training System (DLDT) is an interactive online tool that allows users to learn about NCES data across the education spectrum. DLDT’s computer-based training introduces users to many NCES datasets, explains their designs, and offers technical considerations to facilitate successful analyses. Please see the NCES blog Learning to Use the Data: Online Dataset Training Modules for more details about the DLDT tool.
 




Download and Access Raw Data Files

Users have several options for conducting statistical analyses and producing data tables. Many NCES surveys release public-use raw data files that professors and students can download and analyze using statistical software packages like SAS, STATA, and SPSS. Some data files and syntax files can also be downloaded using NCES data tools:

  • Education Data Analysis Tool (EDAT) and the Online Codebook allow users to download several survey datasets in various statistical software formats. Users can subset a dataset by selecting a survey, a population, and variables relevant to their analysis.
  • Many data files can be accessed directly from the Surveys & Programs page by clicking on the specific survey and then clicking on the “Data Products” link on the survey website.

 

Generate Analyses and Tables

NCES provides several online analysis tools that do not require a statistical software package:

  • DataLab is a tool for making tables and regressions that features more than 30 federal education datasets. It includes three powerful analytic tools:
    • QuickStats—for creating simple tables and charts.
    • PowerStats—for creating complex tables and logistic and linear regressions.
    • TrendStats—for creating complex tables spanning multiple data collection years. DataLab also contains the Tables Library, which houses more than 5,000 published analysis tables searchable by topic, publication, and source.



  • National Assessment of Educational Progress (NAEP) Data Explorer can be used to generate tables, charts, and maps of detailed results from national and state assessments. Users can identify the subject area, grade level, and years of interest and then select variables from the student, teacher, and school questionnaires for analysis.
  • International Data Explorer (IDE) is an interactive tool with data from international assessments and surveys, such as the Program for International Student Assessment (PISA), the Program for the International Assessment of Adult Competencies (PIAAC), and the Trends in International Mathematics and Science Study (TIMSS). The IDE can be used to explore student and adult performance on assessments, create a variety of data visualizations, and run statistical tests and regression analyses.
  • Elementary/Secondary Information System (ElSi) allows users to quickly view public and private school data and create custom tables and charts using data from the Common Core of Data (CCD) and Private School Universe Survey (PSS).
  • Integrated Postsecondary Education Data System (IPEDS) Use the Data provides researcher-focused access to IPEDS data and tools that contain comprehensive data on postsecondary institutions. Users can view video tutorials or use data through one of the many functions within the portal, including the following:
    • Data Trends—Provides trends over time for high-interest topics, including enrollment, graduation rates, and financial aid.
    • Look Up an Institution—Allows for quick access to an institution’s comprehensive profile. Shows data similar to College Navigator but contains additional IPEDS metrics.
    • Statistical Tables—Equips power users to quickly get data and statistics for specific measures, such as average graduation rates by state.

 

 

Collecting School-Level Finance Data: An Evaluation From the Pilot School-Level Finance Survey (SLFS)

Policymakers, researchers, and the public have long voiced concerns about the equitable distribution of school funding within and across school districts. More recently, the Every Student Succeeds Act (ESSA) requires that states and school districts add per pupil expenditures, disaggregated by source of funds, to their annual report cards for each local education agency (LEA) (e.g., school district) and school. In response to these requirements, the National Center for Education Statistics (NCES) developed a new collection of finance data at the school level—the School-Level Finance Survey (SLFS).

The SLFS collects at the school level many of the same expenditure variables currently being collected at the district level on the School District Finance Survey. The pilot SLFS was designed to evaluate whether the survey is a viable, efficient, and cost-effective method to gather school-level finance data. Findings from the pilot survey were recently released in an NCES report titled The Feasibility of Collecting School-Level Finance Data: An Evaluation of Data From the School-Level Finance Survey (SLFS) School Year 2014–15.

Here’s some of what we learned:

 

Many states participating in the SLFS were able to report complete personnel and/or nonpersonnel expenditure data for a high percentage of their schools.

Of the 15 states that participated in the SLFS in school year 2014–15, 9 states were able to report school-level finance data for greater than 95 percent of their operational schools (figure 1). Other than Colorado and New Jersey,[1] all states were able to report SLFS data for at least 84 percent of their schools, ranging from 85 percent in Kentucky to nearly 100 percent in Maine. Just over one-half of reporting states (8 of 15) reported all personnel items (i.e., dollars spent on salaries and wages for teachers, aides, administrators, and support staff) for at least 95 percent of their schools. Seven of 15 states reported all nonpersonnel items (i.e., dollars spent on purchased services, supplies, and other costs not directly related to school employees) for at least 95 percent of their schools.  
 


Figure 1. Percentage of operational schools with fiscal data reported in the SLFS, by participating state: 2014–15

NOTE: This figure includes operational schools only (i.e., excludes closed, inactive, or future schools). The count of schools reported includes schools that can be matched to the Common Core of Data (CCD) School Universe files and for which at least one data item is reported in the SLFS.

SOURCE: U.S. Department of Education, National Center for Education Statistics, Common Core of Data (CCD), “School-Level Finance Survey (SLFS),” fiscal year 2015, Preliminary Version 1a; “Local Education Agency Universe Survey,” 2014–15, Provisional Version 1a.



SLFS data are generally comparable and consistent with other sources of school finance data.

A substantial majority of personnel expenditures can be reported at the school level. Personnel expenditures reported for the SLFS were reasonably comparable with the district-level and state-level data.[2] For common personnel expenditures, the absolute percentage difference between the SLFS and the district survey was less than 9 percent in 8 of 10 states (figure 2). The absolute percentage difference between the SLFS and the state-level survey for common personnel expenditures was less than 9 percent in 6 of 10 states.
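The comparison statistic here is the absolute percentage difference between two surveys’ totals for the same expenditure item. With invented dollar amounts (the figures below are illustrative, not actual SLFS or F-33 totals), it reduces to:

```python
# Absolute percentage difference between two surveys' totals for the
# same expenditure item (the dollar amounts below are invented).
def abs_pct_diff(slfs_total, f33_total):
    """|SLFS - F-33| as a percentage of the district-level (F-33) total."""
    return abs(slfs_total - f33_total) / f33_total * 100

# e.g., $4.56B of salaries aggregated from school-level reports vs.
# $4.80B reported at the district level is a 5 percent absolute
# difference -- within the "less than 9 percent" band described above.
print(round(abs_pct_diff(4.56e9, 4.80e9), 1))  # 5.0
```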
 


Figure 2. Total personnel salaries reported in the School-Level Finance Survey (SLFS), School District Finance Survey (F-33), and National Public Education Financial Survey (NPEFS), by participating state: 2014–15

NOTE: Total personnel salaries include instructional staff salaries, student support services salaries, instructional staff support services salaries, and school administration salaries. This figure includes all schools in the SLFS and all LEAs in the F-33. Only states where reporting standards are met are included.

SOURCE: U.S. Department of Education, National Center for Education Statistics, Common Core of Data (CCD), “School-Level Finance Survey (SLFS),” fiscal year 2015, Preliminary Version 1a; “National Public Education Financial Survey (NPEFS),” fiscal year 2015, Final Version 2a; and “School District Finance Survey (F-33),” fiscal year 2015, Provisional Version 1a.



There are numerous inherent challenges in collecting school-level finance data:

  • The vision of why reporting school-level finance data is important had to be communicated to school finance practitioners.
  • The pilot SLFS did not collect all types of current expenditures.
  • Some states had not fully developed standardized protocols or procedures for reporting finance data at the school level.
  • Legal requirements vary regarding the types of schools that must report finance data and the types of expenditures that schools and districts must report.
  • The survey’s data item definitions were not consistent with states’ internal accounting for some items.

During the pilot survey, NCES and Census Bureau staff took action to address these challenges. 

 

Evidence suggests that it is feasible to collect accurate and informative school-level financial data.

States participating in the SLFS are improving internal data systems and protocols, which will allow them to report complete and comparable school-level finance data. The SLFS promotes efficiency by incorporating long-established NCES standards for school district financial accounting. The results of the pilot SLFS demonstrate that it is feasible to collect accurate and informative school-level finance data, and the data’s informational and analytical value will increase as response rates improve and as states strengthen their capabilities to collect complete, accurate, and comparable finance data at the school level.

 

By Stephen Q. Cornman, NCES; Malia Howell, Stephen Wheeler, and Osei Ampadu, U.S. Census Bureau; and Lei Zhou, Activate Research


[1] In 2014–15, Colorado did not require all school districts to report finance data at the school level; thus, data are reported for only 26 of Colorado’s 262 LEAs. In New Jersey, school-level finance reporting is required only for its “Abbott” districts, which make up only 31 of the state’s 702 districts.

[2] NCES’s Common Core of Data (CCD) program collects school finance data through three annual surveys: the school-level SLFS, the LEA-level School District Finance Survey (F-33), and the state-level National Public Education Financial Survey (NPEFS). Five data items are common to all three fiscal surveys (i.e., are collected at the school level for the SLFS, at the LEA level for the F-33, and at the state level for the NPEFS): instructional staff salaries, student support services salaries, instructional staff support services salaries, school administration salaries, and teacher salaries.

 

 

 

Learning to Use the Data: Online Dataset Training Modules

UPDATED Blog: New and Updated Modules Added

NCES provides a wealth of data online. However, the breadth and depth of the data can be overwhelming to first-time users, and sometimes even to more experienced ones. To help our users learn how to access, navigate, and use NCES datasets, we’ve developed a series of online training modules.

The Distance Learning Dataset Training (DLDT) resource is an online, interactive tool that allows users to learn about NCES data across the education spectrum and evaluate their suitability for specific research purposes. The DLDT program at NCES has developed a growing number of online training modules for several NCES complex sample survey and administrative datasets. The modules teach users about the intricacies of various datasets—including what the data represent, how the data are collected, the sample design, and considerations for analysis—to help users conduct successful analyses.

The DLDT is also a teaching tool that can be used by individuals both in and out of the classroom to learn about NCES complex sample survey and administrative data collections and appropriate analysis methods.

There are two types of NCES DLDT modules available: common modules and dataset-specific modules. The common modules help users broadly understand NCES data across the education spectrum, introduce complex survey methods, and explain how to acquire NCES micro-data. The dataset-specific modules introduce and educate users about particular datasets. The available modules are listed below, and more information can be found on the DLDT website.

 

Available DLDT Modules

Common Modules

  • Introduction to the NCES Distance Learning Dataset Training System
  • Introduction to the NCES Datasets
  • Introduction to NCES Web Gateways: Accessing and Exploring NCES Data
  • Analyzing NCES Complex Survey Data
  • Statistical Analysis of NCES Datasets Employing a Complex Sample Design
  • Acquiring Micro-level NCES Data
  • DataLab Tools: QuickStats, PowerStats, and TrendStats

Dataset-Specific Modules

  • Common Core of Data (CCD)
  • Introduction to MapED
  • Fast Response Survey System (FRSS)
  • Early Childhood Longitudinal Study Birth Cohort (ECLS-B)
  • Early Childhood Longitudinal Study Kindergarten Class of 1998–99 (ECLS-K)
  • Early Secondary Longitudinal Studies (1972–2000)
    • National Longitudinal Study of 1972 (NLS-72)
    • High School and Beyond (HS&B)
    • National Education Longitudinal Study of 1988 (NELS:88)
  • Educational Longitudinal Study of 2002 (ELS:2002)
  • High School Longitudinal Study of 2009 (HSLS:09)
  • Introduction to High School Transcript Studies
  • Integrated Postsecondary Education Data System (IPEDS) – UPDATED!
  • National Assessment of Educational Progress (NAEP)
    • Main, State, and Long-Term Trend NAEP
    • NAEP High School Transcript Study (HSTS)
    • National Indian Education Study (NIES)
  • National Household Education Survey Program (NHES)
  • National Teacher and Principal Survey (NTPS) – NEW!
  • Postsecondary Education Sample Survey Datasets
    • National Postsecondary Student Aid Study (NPSAS)
    • Beginning Postsecondary Student Longitudinal Study (BPS)
    • Baccalaureate and Beyond Longitudinal Study (B&B)
  • Postsecondary Education Quick Information System (PEQIS)
  • Private School Universe Survey (PSS)
  • Schools and Staffing Survey (SASS)
    • Teacher Follow-up Survey (TFS)
    • Principal Follow-up Survey (PFS)
    • Beginning Teacher Longitudinal Study (BTLS)
  • School Survey on Crime and Safety (SSOCS)
  • International Activities Program Studies Datasets
    • Progress in International Reading Literacy Study (PIRLS)
    • Trends in International Mathematics and Science Study (TIMSS) – UPDATED!
    • Program for International Student Assessment (PISA) – UPDATED!
    • Program for the International Assessment of Adult Competencies (PIAAC)

Modules under Construction

  • Accessing NCES Data via the Web
  • Fast Response Survey System (FRSS)
  • Introduction to the Annual Reports and Information Group
  • NCES Longitudinal Studies
  • NCES High School Transcript Collections
  • Mapping Education Data (MapED)
  • Postsecondary Education Quick Information System (PEQIS)

 

This blog was originally posted on July 12, 2016, and was updated on January 11, 2019.

 

By Andy White