Evaluation of NAEP Achievement Levels for Mathematics and Reading

Contract Information

Current Status:

This study has been completed.

Duration:

September 2014 – April 2017

Cost:

$1,256,345

Contract Number:

ED-IES-14-C-0124

Contractor(s):

National Research Council

Reports

Under the provisions of the Education Sciences Reform Act (P.L. 107-279), the Secretary of the U.S. Department of Education (ED) is required to provide for continuing review of the National Assessment of Educational Progress (NAEP). The law identifies the issues to be addressed in the reviews, one of which is whether the NAEP achievement levels, established by the National Assessment Governing Board (NAGB), are "reasonable, valid, reliable and informative to the public." Section 303(e)(2)(C) of the Education Sciences Reform Act further states that NAEP achievement levels shall be used on a trial basis until the Commissioner of the National Center for Education Statistics (NCES) determines, as a result of the evaluation, that such levels are "reasonable, valid, and informative to the public."

This study provided an independent and objective evaluation of the NAEP achievement levels. Its purpose was to give the NCES Commissioner the information needed to decide whether the NAEP achievement levels can be removed from trial status or should remain in it.

  • How "reasonable, valid, reliable and informative to the public" are the NAEP achievement levels for mathematics and reading?

This study focused on the achievement levels used in reporting NAEP results for the reading and mathematics assessments in grades 4, 8, and 12. Specifically, the study team reviewed developments over the past decade in the ways achievement levels for NAEP are set and used, and then evaluated whether the resulting achievement levels were "reasonable, valid, reliable, and informative to the public." The study relied on an independent committee of experts with a broad range of expertise related to assessment, statistics, social science, and education policy. The project received oversight from the Board on Testing and Assessment and the Committee on National Statistics of the National Research Council.

  • The procedures used by the National Assessment Governing Board for setting the achievement levels in 1992 are well documented. The documentation includes the kinds of evidence called for in the Standards for Educational and Psychological Testing, both those in place at the time and those currently in place, and was in line with the research and knowledge base at the time.
  • The available documentation of the 1992 standard settings in reading and mathematics includes the types of reliability analyses called for in the Standards for Educational and Psychological Testing, both those in place at the time and those currently in place. The evidence that resulted from these analyses, however, showed considerable variability among panelists' cut-score judgments: the expected pattern of decreasing variability among panelists across the rounds was not consistently achieved, and panelists' cut-score estimates were not consistent over different item formats and different levels of item difficulty. These issues were not resolved before achievement-level results were released to the public.
  • The studies conducted to assess content validity are in line with those called for in the Standards for Educational and Psychological Testing, both those in place in 1992 and those currently in place (2016). The results of these studies suggested that changes in the achievement-level descriptors (ALDs) were needed, and they were subsequently made. These changes may have better aligned the descriptors to the framework and exemplar items, but as a consequence, the final ALDs were not the ones used to set the cut-scores. Since 1992, there have been additional changes to the frameworks, the item pools, and the assessments, as well as studies to identify needed revisions to the ALDs. To date, however, there has been no effort to set new cut-scores using the most current ALDs.
  • The National Assessment of Educational Progress achievement levels are widely disseminated to and used by many audiences, but the interpretive guidance provided to users about the meaning and appropriate uses of those levels is inconsistent and piecemeal. Without appropriate guidance, misuses are likely.