NCES Blog

National Center for Education Statistics

From Data Collection to Data Release: What Happens?

In today’s world, much scientific data is collected automatically from sensors and processed by computers in real time to produce instant analytic results. People grow accustomed to instant data and expect to get things quickly.

At the National Center for Education Statistics (NCES), we are frequently asked why, in a world of instant data, it takes so long to produce and publish data from surveys. Although improvements in the timeliness of federal data releases have been made, there are fundamental differences in the nature of data compiled by automated systems and specific data requested from federal survey respondents. Federal statistical surveys are designed to capture policy-related and research data from a range of targeted respondents across the country, who may not always be willing participants.

This blog post provides a brief overview of the survey data processing framework, but it’s important to understand that the survey design phase is, in itself, a highly complex and technical process. In contrast to a management information system, in which an organization has complete control over data production processes, federal education surveys are designed to represent the entire country and require coordination with other federal, state, and local agencies. After the necessary coordination activities have been concluded, and the response periods for surveys have ended, much work remains to be done before the survey data can be released.

Survey Response

One of the first sources of potential delays is that some jurisdictions or individuals are unable to fill in their surveys on time. Unlike opinion polls and online quizzes, which use anyone who feels like responding to the survey (convenience samples), NCES surveys use rigorously formulated samples meant to properly represent specific populations, such as states or the nation as a whole. In order to ensure proper representation within the sample, NCES follows up with nonresponding sampled individuals, education institutions, school districts, and states to ensure the maximum possible survey participation within the sample. Some large jurisdictions, such as the New York City school district, also have their own extensive survey operations to conclude before they can provide information to NCES. Before the New York City school district, which is larger than about two-thirds of all state education systems, can respond to NCES surveys, it must first gather information from all its schools. Receipt of data from New York City and other large districts is essential to compiling nationally representative data.

Editing and Quality Reviews

Waiting for final survey responses does not mean that survey processing comes to a halt. One of the most important roles NCES plays in survey operations is editing and conducting quality reviews of incoming data, which take place on an ongoing basis. In these quality reviews, a variety of strategies are used to make cost-effective and time-sensitive edits to the incoming data. For example, in the Integrated Postsecondary Education Data System (IPEDS), individual higher education institutions upload their survey responses and receive real-time feedback on responses that are out of range compared to prior submissions or instances where survey responses do not align in a logical way. All NCES surveys use similar logic checks in addition to a range of other editing checks that are appropriate to the specific survey. These checks typically look for responses that are out of range for a certain type of respondent.

Although most checks are automated, some particularly complicated or large responses may require individual review. For IPEDS, the real-time feedback described above is followed by quality review checks that are done after collection of the full dataset. This can result in individualized follow-up and review with institutions whose data still raise substantive questions.
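To make the two kinds of automated checks described above more concrete, here is a minimal sketch of a range check against a prior submission and a logic check on internal consistency. The field names, the 25 percent tolerance, and the sample values are all hypothetical and are not taken from any NCES system.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Submission:
    """One institution's response; the field names are hypothetical."""
    unit_id: str
    total_enrollment: int
    full_time: int
    part_time: int


def edit_flags(current: Submission, prior: Submission, tolerance: float = 0.25) -> List[str]:
    """Return human-readable flags for values that deserve a second look."""
    flags = []

    # Range check: compare this year's total with the prior submission.
    if prior.total_enrollment > 0:
        change = abs(current.total_enrollment - prior.total_enrollment) / prior.total_enrollment
        if change > tolerance:
            flags.append(f"total_enrollment changed {change:.0%} from prior submission")

    # Logic check: the reported parts should add up to the reported total.
    if current.full_time + current.part_time != current.total_enrollment:
        flags.append("full_time + part_time does not equal total_enrollment")

    return flags


# Made-up values for illustration only.
prior = Submission("A001", total_enrollment=1000, full_time=700, part_time=300)
current = Submission("A001", total_enrollment=1400, full_time=900, part_time=450)
flags = edit_flags(current, prior)
```

In this made-up case both checks fire, which is the kind of result that would prompt feedback to the respondent rather than an automatic correction.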

Sample Weighting

In order to lessen the burden on the public and reduce costs, NCES collects data from selected samples of the population rather than taking a full census of the entire population for every study. In all sample surveys, a range of additional analytic tasks must be completed before data can be released. One of the more complicated tasks is constructing weights based on the original sample design and survey responses so that the collected data can properly represent the nation and/or states, depending on the survey. These sample weights are designed so that analyses can be conducted across a range of demographic or geographic characteristics and properly reflect the experiences of individuals with those characteristics in the population.
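As a rough illustration of why weights matter, the sketch below builds a weight for each respondent from the inverse of the selection probability and a simple nonresponse adjustment, then compares weighted and unweighted estimates. The two strata, their counts, and the yes/no responses are invented for the example, and real NCES weighting procedures are considerably more involved.

```python
def final_weight(population: int, sampled: int, respondents: int) -> float:
    """Weight per respondent in one stratum (simplified, hypothetical design)."""
    base_weight = population / sampled              # inverse of the selection probability
    nonresponse_adjustment = sampled / respondents  # respondents also stand in for nonrespondents
    return base_weight * nonresponse_adjustment


def weighted_mean(strata: dict) -> float:
    """Weighted mean of the responses across all strata."""
    total = 0.0
    weight_sum = 0.0
    for stratum in strata.values():
        responses = stratum["responses"]
        w = final_weight(stratum["population"], stratum["sampled"], len(responses))
        total += w * sum(responses)
        weight_sum += w * len(responses)
    return total / weight_sum


# Made-up strata: 1 = "yes", 0 = "no" on some survey item.
strata = {
    "urban": {"population": 8000, "sampled": 10, "responses": [1, 1, 1, 0, 1, 0, 1, 1]},
    "rural": {"population": 2000, "sampled": 10, "responses": [0, 0, 1, 0, 0]},
}

weighted = weighted_mean(strata)
all_responses = [r for s in strata.values() for r in s["responses"]]
unweighted = sum(all_responses) / len(all_responses)
```

Because the urban stratum represents far more of the population than its share of respondents suggests, the weighted estimate differs noticeably from the simple average of the responses.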

If the survey response rate is too low, a “survey bias analysis” must be completed to ensure that the results will be sufficiently reliable for public use. For longitudinal surveys, such as the Early Childhood Longitudinal Study, multiple sets of weights must be constructed so that researchers using the data will be able to appropriately account for respondents who answered some but not all of the survey waves.

NCES surveys also include “constructed variables” to facilitate more convenient and systematic use of the survey data. Examples of constructed variables include socioeconomic status or family type. Other types of survey data also require special analytic considerations before they can be released. Student assessment data, such as the National Assessment of Educational Progress (NAEP), require that a number of highly complex processes be completed to ensure proper estimations for the various populations being represented in the results. For example, just the standardized scoring of multiple choice and open-ended items can take thousands of hours of design and analysis work.

Privacy Protection

Release of data by NCES carries a legal requirement to protect the privacy of our nation’s children. Each NCES public-use dataset undergoes a thorough evaluation to ensure that it cannot be used to identify responses of individuals, whether they are students, parents, teachers, or principals. The datasets must be protected through item suppression, statistical swapping, or other techniques to ensure that multiple datasets cannot be combined in such a way as to identify any individual. This is a time-consuming process, but it is incredibly important to protect the privacy of respondents.
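One of the techniques mentioned above, suppression, can be sketched in a few lines: count the respondents in each cell of a table and withhold any count that falls below a minimum size. The threshold of 3, the record layout, and the values are assumptions made for illustration. Actual disclosure review also has to consider swapping and whether suppressed cells could be recovered from totals or other datasets, which this sketch does not attempt.

```python
from collections import Counter

MIN_CELL_SIZE = 3  # hypothetical threshold, not an NCES rule


def suppressed_table(records, keys, min_cell_size=MIN_CELL_SIZE):
    """Count records per cell; replace small counts with None (suppressed)."""
    counts = Counter(tuple(record[k] for k in keys) for record in records)
    return {cell: (n if n >= min_cell_size else None) for cell, n in counts.items()}


# Made-up respondent records.
records = (
    [{"state": "X", "role": "teacher"}] * 4
    + [{"state": "X", "role": "principal"}] * 1
    + [{"state": "Y", "role": "teacher"}] * 3
)

table = suppressed_table(records, keys=("state", "role"))
```

Here the single principal in state X would be withheld, since publishing a cell of one could point to an identifiable person.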

Data and Report Release

When the final data have been received and edited, the necessary variables have been constructed, and the privacy protections have been implemented, there is still more that must be done to release the data. The data must be put in appropriate formats with the necessary documentation for data users. NCES reports with basic analyses or tabulations of the data must be prepared. These products are independently reviewed within the NCES Chief Statistician’s office.

Depending on the nature of the report, the Institute of Education Sciences Standards and Review Office may conduct an additional review. After all internal reviews have been conducted, revisions have been made, and the final survey products have been approved, the U.S. Secretary of Education’s office is notified 2 weeks in advance of the pending release. During this notification period, appropriate press release materials and social media announcements are finalized.

Although NCES can expedite some product releases, the work of preparing survey data for release often takes a year or more. NCES strives to maintain a balance between timeliness and providing the reliable high-quality information that is expected of a federal statistical agency while also protecting the privacy of our respondents.  

 

By Thomas Snyder

Data Tools for College Professors and Students

Ever wonder what parts of the country produce the most English majors? Want to know which school districts have the most guidance counselors? The National Center for Education Statistics (NCES) has all the tools you need to dig into these and lots of other data!

Whether you’re a student embarking on a research project or a college professor looking for a large data set to use for an assignment, NCES has you covered. Below, check out the tools you can use to conduct searches, download datasets, and generate your own statistical tables and analyses.

 

Conduct Publication Searches

Two search tools help researchers identify potential data sources for their study and explore prior research conducted with NCES data. The Publications & Products Search Tool can be used to search for NCES publications and data products. The Bibliography Search Tool, which is updated continually, allows users to search for individual citations from journal articles that have been published using data from most surveys conducted by NCES.

Key reference publications include the Digest of Education Statistics, which is a comprehensive library of statistical tabulations, and The Condition of Education, which highlights up-to-date trends in education through statistical indicators.

 

Learn with Instructional Modules

The Distance Learning Dataset Training System (DLDT) is an interactive online tool that allows users to learn about NCES data across the education spectrum. DLDT’s computer-based training introduces users to many NCES datasets, explains their designs, and offers technical considerations to facilitate successful analyses. Please see the NCES blog Learning to Use the Data: Online Dataset Training Modules for more details about the DLDT tool.
 




Download and Access Raw Data Files

Users have several options for conducting statistical analyses and producing data tables. Many NCES surveys release public-use raw data files that professors and students can download and analyze using statistical software packages like SAS, Stata, and SPSS. Some data files and syntax files can also be downloaded using NCES data tools:

  • Education Data Analysis Tool (EDAT) and the Online Codebook allow users to download several survey datasets in various statistical software formats. Users can subset a dataset by selecting a survey, a population, and variables relevant to their analysis.
  • Many data files can be accessed directly from the Surveys & Programs page by clicking on the specific survey and then clicking on the “Data Products” link on the survey website.

 

Generate Analyses and Tables

NCES provides several online analysis tools that do not require a statistical software package:

  • DataLab is a tool for making tables and regressions that features more than 30 federal education datasets. It includes three powerful analytic tools:
    • QuickStats—for creating simple tables and charts.
    • PowerStats—for creating complex tables and logistic and linear regressions.
    • TrendStats—for creating complex tables spanning multiple data collection years. This tool also contains the Tables Library, which houses more than 5,000 published analysis tables by topic, publication, and source.



  • National Assessment of Educational Progress (NAEP) Data Explorer can be used to generate tables, charts, and maps of detailed results from national and state assessments. Users can identify the subject area, grade level, and years of interest and then select variables from the student, teacher, and school questionnaires for analysis.
  • International Data Explorer (IDE) is an interactive tool with data from international assessments and surveys, such as the Program for International Student Assessment (PISA), the Program for the International Assessment of Adult Competencies (PIAAC), and the Trends in International Mathematics and Science Study (TIMSS). The IDE can be used to explore student and adult performance on assessments, create a variety of data visualizations, and run statistical tests and regression analyses.
  • Elementary/Secondary Information System (ElSi) allows users to quickly view public and private school data and create custom tables and charts using data from the Common Core of Data (CCD) and Private School Universe Survey (PSS).
  • Integrated Postsecondary Education Data System (IPEDS) Use the Data provides researcher-focused access to IPEDS data and tools that contain comprehensive data on postsecondary institutions. Users can view video tutorials or use data through one of the many functions within the portal, including the following:
    • Data Trends—Provides trends over time for high-interest topics, including enrollment, graduation rates, and financial aid.
    • Look Up an Institution—Allows for quick access to an institution’s comprehensive profile. Shows data similar to College Navigator but contains additional IPEDS metrics.
    • Statistical Tables—Equips power users to quickly get data and statistics for specific measures, such as average graduation rates by state.

 

 

New Report Shows Increased Diversity in U.S. Schools, Disparities in Outcomes

The school-age population in the United States is becoming more racially and ethnically diverse. An NCES report released in February 2019, Status and Trends in the Education of Racial and Ethnic Groups 2018, examines how education experiences and outcomes vary among racial/ethnic groups. The report contains 36 indicators that cover preprimary to postsecondary education, as well as family background characteristics and labor force outcomes.

Between 2000 and 2017, the percentage of 5- to 17-year-olds who were White decreased from 62 to 51 percent, while the percentage who were Hispanic increased from 16 to 25 percent.

 


Figure 1. Percentage distribution of the U.S. resident population ages 5–17, by race/ethnicity: 2000 and 2017

# Rounds to zero.

NOTE: Data are for the resident population as of July 1 of the indicated year.

SOURCE: U.S. Department of Commerce, Census Bureau, 2000 Population Estimates, retrieved August 14, 2012, from http://www.census.gov/popest/data/national/asrh/2011/index.html; and 2017 Population Estimates, retrieved September 5, 2017, from https://www.census.gov/data/datasets/2016/demo/popest/nation-detail.html. See Digest of Education Statistics 2017, table 101.20.


 

Prior research shows that living in poverty during early childhood is associated with lower-than-average academic performance that begins in kindergarten[1] and extends through high school, leading to lower-than-average rates of school completion.[2] In 2016, the percentages of children living in poverty were highest for Black and American Indian/Alaska Native children and lowest for White and Asian children.

 


Figure 2. Percentage of children under age 18 living in poverty, by race/ethnicity: 2016

NOTE: Data shown are based only on related children in a family; that is, all children in the household who are related to the householder by birth, marriage, or adoption (except a child who is the spouse of the householder).

SOURCE: U.S. Department of Commerce, Census Bureau, American Community Survey (ACS), 2016. See Digest of Education Statistics 2017, table 102.60.


 

The National Assessment of Educational Progress (NAEP), given to a representative sample of students across the United States, measures student performance over time in various subjects (including reading, math, and science) at grades 4, 8, and 12. Average grade 4 reading scores were higher in 2017 than in 1992 for the racial/ethnic groups with available data. Between 1992 and 2017, the White-Black score gap narrowed from 32 points to 26 points. However, the White-Hispanic gap in 2017 was not measurably different from the corresponding gap in 1992.

 


Figure 3. Average National Assessment of Educational Progress (NAEP) reading scale scores of grade 4 students, by selected race/ethnicity: 1992 and 2017

NOTE: Includes public and private schools. Testing accommodations (e.g., extended time, small group testing) for children with disabilities and English language learners were not permitted in 1992.

SOURCE: U.S. Department of Education, National Center for Education Statistics, National Assessment of Educational Progress (NAEP), 1992 and 2017 Reading Assessments, NAEP Data Explorer. See Digest of Education Statistics 2017, table 221.10.


 

Looking at higher education, between 2000 and 2016, the largest changes in the racial/ethnic composition of undergraduate students were for White students and Hispanic students. The share of undergraduates who were White decreased from 70 to 56 percent, and the share who were Hispanic increased from 10 to 19 percent.

 


Figure 4. Percentage of total undergraduate student enrollment in degree-granting institutions, by race/ethnicity: Fall 2000 and fall 2016

NOTE: Other includes Asian students, Pacific Islander students, and students of Two or more races.

SOURCE: U.S. Department of Education, National Center for Education Statistics, Integrated Postsecondary Education Data System (IPEDS), Spring 2001 and Spring 2017, Fall Enrollment component. See Digest of Education Statistics 2017, table 306.10.


 

Postsecondary graduation rates vary widely by racial/ethnic group. For instance, among first-time students at 4-year institutions who enrolled in 2010, 74 percent of Asian students had graduated within 6 years. This was approximately 35 percentage points higher than the graduation rates for American Indian/Alaska Native students and Black students.   

 


Figure 5. Graduation rates within 6 years from first institution attended for first-time, full-time bachelor's degree-seeking students at 4-year postsecondary institutions, by race/ethnicity: Cohort entry year 2010

SOURCE: U.S. Department of Education, National Center for Education Statistics, Integrated Postsecondary Education Data System (IPEDS), Winter 2016–17, Graduation Rates component. See Digest of Education Statistics 2017, table 326.10.


 

The report also includes a new spotlight indicator, which highlights institutions that serve a large number of students from minority racial and ethnic groups. For instance, historically Black colleges and universities (HBCUs) are defined as “any historically Black college or university that was established prior to 1964, whose principal mission was, and is, the education of Black Americans.” In fall 2016, there were 102 HBCUs that enrolled over 292,000 students, 77 percent of whom were Black.

 



 

The spotlight also highlights other groups of minority-serving institutions—Hispanic-serving institutions, Tribally controlled colleges and universities, and Asian American and Native American Pacific Islander-serving institutions—describes how an institution is recognized as belonging to one of these groups, and discusses other institution characteristics, such as enrollment and degrees conferred.

For more information, visit the report’s website, where you can browse the indicators or download the full report.

 

By Cris de Brey

 


[1] Mulligan, G.M., Hastedt, S., and McCarroll, J.C. (2012). First-Time Kindergartners in 2010–11: First Findings From the Kindergarten Rounds of the Early Childhood Longitudinal Study, Kindergarten Class of 2010–11 (ECLS-K:2011) (NCES 2012-049). U.S. Department of Education. Washington, DC: National Center for Education Statistics. Retrieved from https://nces.ed.gov/pubsearch/pubsinfo.asp?pubid=2012049.

[2] Ross, T., Kena, G., Rathbun, A., KewalRamani, A., Zhang, J., Kristapovich, P., and Manning, E. (2012). Higher Education: Gaps in Access and Persistence Study (NCES 2012-046). U.S. Department of Education. Washington, DC: National Center for Education Statistics. Retrieved from https://nces.ed.gov/pubsearch/pubsinfo.asp?pubid=2012046.

Announcing the Condition of Education 2019 Release

We are pleased to present The Condition of Education 2019, a congressionally mandated annual report summarizing the latest data on education in the United States. This report is designed to help policymakers and the public monitor educational progress. This year’s report includes 48 indicators on topics ranging from prekindergarten through postsecondary education, as well as labor force outcomes and international comparisons.

In addition to the regularly updated annual indicators, this year’s spotlight indicators show how recent NCES surveys have expanded our understanding of outcomes in postsecondary education.

The first spotlight examines the variation in postsecondary enrollment patterns between young adults who were raised in high- and low-socioeconomic status (SES) families. The study draws on data from the NCES High School Longitudinal Study of 2009, which collected data on a nationally representative cohort of ninth-grade students in 2009 and has continued to survey these students as they progress through postsecondary education. The indicator finds that the percentage of 2009 ninth-graders who were enrolled in postsecondary education in 2016 was 50 percentage points larger for the highest SES students (78 percent) than for the lowest SES students (28 percent). Among the highest SES 2009 ninth-graders who had enrolled in a postsecondary institution by 2016, more than three-quarters (78 percent) first pursued a bachelor’s degree and 13 percent first pursued an associate’s degree. In contrast, the percentage of students in the lowest SES category who first pursued a bachelor’s degree (32 percent) was smaller than the percentage who first pursued an associate’s degree (42 percent). In addition, the percentage who first enrolled in a highly selective 4-year institution was larger for the highest SES students (37 percent) than for the lowest SES students (7 percent).

The complete indicator, Young Adult Educational and Employment Outcomes by Family Socioeconomic Status, contains more information about how enrollment, persistence, choice of institution (public, private nonprofit, or private for-profit and 2-year or 4-year), and employment varied by the SES of the family in which young adults were raised.

 


Among 2009 ninth-graders who had enrolled in postsecondary education by 2016, percentage distribution of students' first credential pursued at first postsecondary institution, by socioeconomic status: 2016

1 Socioeconomic status was measured by a composite score of parental education and occupations and family income in 2009.
NOTE: Postsecondary outcomes are as of February 2016, approximately 3 years after most respondents had completed high school. Although rounded numbers are displayed, the figures are based on unrounded data. Detail may not sum to totals because of rounding.

SOURCE: U.S. Department of Education, National Center for Education Statistics, High School Longitudinal Study of 2009 (HSLS:09), Base Year and Second Follow-up. See Digest of Education Statistics 2018, table 302.44.


 

The second spotlight explores new data on postsecondary outcomes, including completion and transfer rates, for nontraditional undergraduate students. While the Integrated Postsecondary Education Data System formerly collected outcomes data only for first-time, full-time students, a new component of the survey includes information on students who enroll part time, transfer among institutions, or leave postsecondary education temporarily but later enroll again. These expanded data are particularly important for 2-year institutions, where higher percentages of students are nontraditional. For example, the indicator finds that, among students who started at public 2-year institutions in 2009, completion rates 8 years after entry were higher among full-time students (30 percent for first-time students and 38 percent for non-first-time students) than among part-time students (16 percent for first-time students and 21 percent for non-first-time students). Also at public 2-year institutions, transfer rates 8 years after entry were higher among non-first-time students (37 percent for part-time students and 30 percent for full-time students) than among first-time students (24 percent for both full-time and part-time students).

For more findings, including information on outcomes for nontraditional students at 4-year institutions, read the complete indicator, Postsecondary Outcomes for Nontraditional Undergraduate Students.

 


Percentage distribution of students' postsecondary outcomes 8 years after beginning at 2-year institutions in 2009, by initial attendance level and status: 2017

# Rounds to zero.
1 Attendance level (first-time or non-first-time student) and attendance status (full-time or part-time student) are based on the first full term (i.e., semester or quarter) after the student entered the institution. First-time students are those who had never attended a postsecondary institution prior to their 2009–10 entry into the reporting institution.
2 Includes certificates, associate’s degrees, and bachelor’s degrees. Includes only those awards that were conferred by the reporting institution (i.e., the institution the student entered in 2009–10); excludes awards conferred by institutions to which the student later transferred.
3 Refers to the percentage of students who were known transfers (i.e., those who notified their initial postsecondary institution of their transfer). The actual transfer rate (including students who transferred, but did not notify their initial institution) may be higher.
4 Includes students who dropped out of the reporting institution and students who transferred to another institution without notifying the reporting institution.
NOTE: The 2009 entry cohort includes all degree/certificate-seeking undergraduate students who entered a degree-granting institution between July 1, 2009, and June 30, 2010. Student enrollment status and completion status are determined as of August 31 of the year indicated; for example, within 8 years after the student’s 2009–10 entry into the reporting institution means by August 31, 2017. Detail may not sum to totals because of rounding. Although rounded numbers are displayed, the figures are based on unrounded data.

SOURCE: U.S. Department of Education, National Center for Education Statistics, Integrated Postsecondary Education Data System (IPEDS), Winter 2017–18, Outcome Measures component; and IPEDS Fall 2009, Institutional Characteristics component. See Digest of Education Statistics 2018, table 326.27.


 

The Condition of Education includes an At a Glance section, which allows readers to quickly make comparisons within and across indicators, and a Highlights section, which captures key findings from each indicator. The report also contains a Reader’s Guide, a Glossary, and a Guide to Sources that provide additional background information. Each indicator provides links to the source data tables used to produce the analyses.

As new data are released throughout the year, indicators will be updated and made available on The Condition of Education website. In addition, NCES produces a wide range of reports and datasets designed to help inform policymakers and the public. For more information on our latest activities and releases, please visit our website or follow us on Twitter, Facebook, and LinkedIn.

 

By James L. Woodworth, NCES Commissioner

IPEDS Finance Data Reveal How Pension Benefits May Contribute to the Growth of Public Postsecondary Institutions’ Financial Liabilities

In the long-standing conversation about high college costs, have you ever wondered what public colleges and universities owe? For fiscal year (FY) 2017, the National Center for Education Statistics (NCES), using the Integrated Postsecondary Education Data System (IPEDS), found that 1,624[1] public institutions carried debt and total financial obligations of $451 billion in current dollars (see figure 1).

New finance data from IPEDS can now provide more insight about these obligations than was previously available.

Several common financial obligations or liabilities[2] can be found across all U.S. postsecondary institutions. A portion of an institution’s liabilities can be attributed to pension benefits and contributions (i.e., pension liabilities). Since FY 2015, IPEDS has collected data on these obligations as a specific part of the total debt held by public postsecondary institutions. For example, the total amount of pension benefits and contributions that public institutions owed their employees in FY 2017 was $95 billion (see figure 1).

 



 

Before FY 2015, institutions did not have to report their pension liabilities to NCES, and the total liabilities for public institutions were $304 billion in FY 2014. After the change in reporting standards, the total liabilities for all public institutions jumped to $395 billion in FY 2015. This increase is greater than the increases in all other fiscal years from 2012 to 2017, which suggests that the implementation of the new pension reporting standards may have contributed to the jump in reported total liabilities.
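For readers who want to see the size of that jump, the short calculation below uses only the dollar figures quoted in this post (in billions of current dollars). The variable names are ours, and the results are rounded for display.

```python
# Figures quoted in this post, in billions of current dollars.
total_liabilities = {2014: 304, 2015: 395, 2017: 451}
pension_liabilities_2017 = 95

# Change in total liabilities from FY 2014 to FY 2015.
increase_2014_to_2015 = total_liabilities[2015] - total_liabilities[2014]
percent_increase = 100 * increase_2014_to_2015 / total_liabilities[2014]

# Pension liabilities as a share of FY 2017 total liabilities.
pension_share_2017 = 100 * pension_liabilities_2017 / total_liabilities[2017]

print(f"FY 2014 to FY 2015 increase: ${increase_2014_to_2015} billion ({percent_increase:.1f} percent)")
print(f"Pension share of FY 2017 total liabilities: {pension_share_2017:.1f} percent")
```

The calculation describes the change in what was reported. It cannot by itself separate the effect of the new reporting standards from other growth in liabilities.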

Reporting Change in Context

Prior to the revised pension reporting standards, dating back to 1997, public institutions reported the difference between their annual required contribution to the pension plan(s) and the actual annual contribution (i.e., net pension obligation). The revised standards, known as Governmental Accounting Standards Board (GASB) Statements 67 and 68, require institutions to report the entire unfunded pension amount (i.e., net pension liability), not just the amount of deficiency in annual payments.

Reporting the institution’s full current pension liability instead of its annual shortfall in pension funding resulted in large shifts in the balance sheets of many public institutions. For example, if an institution had a total of $2 million in pension liabilities, prior to 2015 it would not have reported the $2 million in net pension liabilities, only the amount by which its actual payment for that year fell short of the required contribution. Now, this institution must report the full $2 million in net pension liabilities, even if the annual required contribution has been paid in full. This revision of the financial reporting standards increased the transparency and accuracy of the total liabilities reported by institutions.
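The difference between the two reporting approaches can be sketched with the $2 million example above. The functions follow the simplified description in this post rather than the full accounting standards, and the $150,000 required contribution and the payment amounts are hypothetical.

```python
def reported_before_fy2015(required_contribution: int, actual_contribution: int) -> int:
    """Earlier approach, as simplified in this post: only the shortfall in the annual payment."""
    return max(required_contribution - actual_contribution, 0)


def reported_from_fy2015(net_pension_liability: int) -> int:
    """Revised approach, as simplified in this post: the entire unfunded pension amount."""
    return net_pension_liability


NET_PENSION_LIABILITY = 2_000_000  # the $2 million example from the text
REQUIRED_CONTRIBUTION = 150_000    # hypothetical annual required contribution

# Case 1: the institution paid its required contribution in full.
old_paid_in_full = reported_before_fy2015(REQUIRED_CONTRIBUTION, 150_000)

# Case 2: the institution paid $100,000 of the $150,000 required.
old_with_shortfall = reported_before_fy2015(REQUIRED_CONTRIBUTION, 100_000)

# Under the revised approach, the payment history does not change the reported amount.
new_reported = reported_from_fy2015(NET_PENSION_LIABILITY)
```

Under these assumptions the earlier approach would show $0 or $50,000, while the revised approach shows the full $2,000,000 in both cases.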

Additional IPEDS Resources

NCES encourages educational researchers to use IPEDS data—a primary source on U.S. colleges, universities, and technical and vocational institutions. For more information about the IPEDS data, visit the IPEDS Survey Components page.

While finance data from the IPEDS collection may seem to be targeted at accountants and business officers, researchers interested in a postsecondary institution’s financial health can explore it through expense and revenue metrics, which may yield data-driven, bellwether information. To learn more about an institution’s finance data, in particular its pension benefits, see the current finance survey materials; archived changes to the survey materials in 2015–16 (FY 2015), such as the implementation of the new pension reporting standards; and links to Video Tutorials, FAQs, glossary definitions, and other helpful resources.

 

By Bao Le, Aida Ali Akreyi, and Gigi Jones


[1] This total includes 735 four-year public institutions, 889 two-year public institutions, and 63 administrative public system offices (41 four-year and 22 two-year offices). Administrative system offices can report on behalf of their campuses. The four non-Title IV-eligible U.S. service academies are not included.

[2] Liabilities include long-term debts (current and noncurrent) as well as other current and noncurrent liabilities such as pensions, compensated absences, claims and judgments, etc.