
IES Grant

Title: Developing More Effective Test-Based Accountability by Improving Validity Under High-Stakes Conditions
Center: NCER
Year: 2011
Principal Investigator: Koretz, Daniel
Awardee: Harvard University
Program: Improving Education Systems      [Program Details]
Award Period: 4 years
Award Amount: $1,564,713
Type: Measurement
Award Number: R305A110420
Description:

Co-Principal Investigator: Jennifer Jennings (New York University)

Purpose: Research on test-based accountability has documented a variety of distortions of educational practice that can lead to test score inflation. Specifically, this means that gains in scores on the tests used for accountability can be far larger than the actual gains in student learning they are intended to signal. The goals of this project are to: (1) evaluate the limitations of current assessment systems, including the features of tests that are most vulnerable to score inflation and the types of schools and students most affected by such inflation; (2) develop and evaluate new approaches for validating inferences based on scores on tests used for accountability; and (3) design new research-based approaches to assessment to lessen the unwanted side effects of test-based accountability.

Project Activities: Researchers will first examine test forms to identify predictable patterns that could create inflation. Second, researchers will examine differential gains on parts of the assessments, since preparation focused on predictable patterns within the test should be accompanied by differentially large performance gains on items reflecting these opportunities. Third, the research team will develop and evaluate "self-monitoring assessments" (i.e., assessments that incorporate audit components directly into operational assessments and thus can localize inflation at the level of schools) and compare performance on parts of the operational assessment with these audit components. Fourth, researchers will compare performance on parts of the operational assessment and parts of the National Assessment of Educational Progress (NAEP). Finally, they will evaluate the extent to which trends in performance on accountability tests generalize to later outcomes, such as high school and college performance.

Products: Products from this project will include fully developed and validated tests that are better suited for use in accountability systems, in particular tests that are more resistant to student score inflation. Peer-reviewed publications will also be produced.

Structured Abstract

Setting: This study uses student-by-test-item data from three states: New York, Texas, and Massachusetts. The New York State Education Department (NYSED) provided data and the opportunity to field new assessment designs.

Population: The NYSED provided statewide student-by-test-item data for the 2006–2010 cohorts of students in grades 6 through 8. This included demographic and other student data linked to schools, as well as data from all mathematics and English Language Arts Regents (high school) examinations administered since 2006. Texas provided student-by-test-item and administrative data for all students in the state from 1994–2010. Finally, in Massachusetts, researchers built a school-by-test-item data panel for grades 3–10 (excluding grade 9) over the period 2003–2009. These data are linked to characteristics of the schools, such as student demographics and teacher characteristics. Researchers also have student-level data nested within Massachusetts schools.

Intervention: The core focus of this project is to develop improved assessments for purposes of test-based accountability. This requires several elements, including: (1) rigorously evaluating the limitations and weaknesses of current assessment systems, the attributes of the system that are most problematic, and the types of schools and students most affected by them; (2) developing and evaluating new approaches for validating inferences based on scores on tests used for accountability; and (3) designing new, research-based approaches to assessment that may lessen the unwanted side-effects of test-based accountability and increase the desired effects on both overall achievement and equity.

Research Design and Methods: Researchers will extend traditional validation methods by examining variations in generalizability within tests (individual items) rather than only the generalizability of total scores. First, researchers will examine test forms to identify predictable patterns that could create inflation. Second, researchers will examine differential gains on parts of the assessments, since preparation focused on predictable patterns within the test should be accompanied by differentially large performance gains on items reflecting these opportunities. Third, the research team will develop and evaluate "self-monitoring assessments" (i.e., assessments that incorporate audit components directly into operational assessments and thus can localize inflation at the level of schools) and compare performance on parts of the operational assessment with these audit components. Fourth, researchers will compare performance on parts of the operational assessment and parts of NAEP. Finally, they will evaluate the extent to which trends in performance on accountability tests generalize to later outcomes, such as high school and college performance.

Control Condition: There is no true control condition, but NAEP will be used for comparisons with state assessments, and items will be compared with one another.

Key Measures: Key measures include state assessments from New York, Texas, and Massachusetts as well as the state and national NAEP assessments.

Data Analytic Strategy: Researchers will use several methods to analyze differential trends in performance. One approach is an adaptation of differential item functioning methods commonly used to flag items for potential bias. Researchers will also apply multi-level mixed models to evaluate contrasts for full assessments and individual assessment items.
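The grant abstract does not include the project's actual code, but the adapted differential item functioning (DIF) approach it mentions can be illustrated with a standard baseline: the Mantel-Haenszel procedure, which stratifies examinees by total score and asks whether one group still outperforms the other on a given item within strata. The sketch below is a hypothetical, minimal implementation for illustration only; the function name and toy data are the author's assumptions, not part of the project.

```python
# Hypothetical sketch: Mantel-Haenszel DIF screening (not the project's code).
# An MH common odds ratio far from 1 flags an item that behaves differently
# across groups even after conditioning on overall ability (total score).
from collections import defaultdict

def mantel_haenszel_odds_ratio(scores, groups, item_correct):
    """Estimate the MH common odds ratio for one studied item.

    scores       -- total test scores, used as the matching/stratifying variable
    groups       -- 'ref' or 'focal' label for each examinee
    item_correct -- 1 if the examinee answered the studied item correctly, else 0
    """
    # Build a 2x2 table (group x correct/incorrect) within each score stratum.
    strata = defaultdict(lambda: [[0, 0], [0, 0]])
    for s, g, y in zip(scores, groups, item_correct):
        row = 0 if g == "ref" else 1
        strata[s][row][1 - y] += 1  # column 0 = correct, column 1 = incorrect

    num = den = 0.0
    for (a, b), (c, d) in strata.values():
        # a, b: reference correct/incorrect; c, d: focal correct/incorrect
        n = a + b + c + d
        if (a + b) == 0 or (c + d) == 0:
            continue  # stratum contains only one group; it carries no signal
        num += a * d / n
        den += b * c / n
    return num / den if den else float("inf")

# Toy example: at every score level the reference group answers the studied
# item correctly more often, so the odds ratio should exceed 1.
scores = [10] * 8 + [20] * 8
groups = (["ref"] * 4 + ["focal"] * 4) * 2
correct = [1, 1, 1, 0, 1, 0, 0, 0] * 2
print(mantel_haenszel_odds_ratio(scores, groups, correct))  # prints 9.0
```

In DIF screening, such ratios are computed item by item and large deviations from 1 are flagged; the project's adaptation repurposes this logic to detect score inflation rather than item bias.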

Publications

Book chapter

Holcombe, R., Jennings, J., & Koretz, D. (2013). The Roots of Score Inflation: An Examination of Opportunities in Two States' Tests. In G. Sunderman (Ed.), Charting Reform, Achieving Equity in a Diverse Nation (pp. 163–189). Information Age Publishing.

Jennings, J. L., & Sohn, H. (2013). A Tale of Two Tests: Test Scores, Accountability, and Inequality in American Education. In D. Anagnostopoulos, S. A. Rutledge, & R. Jacobsen (Eds.), The Infrastructure of Accountability: Data Use and the Transformation of American Education. Harvard Education Press.

Journal article, monograph, or newsletter

Jennings, J. L., and Lauen, D. L. (2016). Accountability, Inequality, and Achievement: The Effects of the No Child Left Behind Act on Multiple Measures of Student Learning. Russell Sage Foundation Journal of the Social Sciences, 2 (5): 220–241.

Jennings, J. L., and Sohn, H. (2014). Measure for Measure: How Proficiency-Based Accountability Systems Affect Inequality in Academic Achievement. Sociology of Education, 87 (2): 125–141.

Jennings, J. L., and Bearak, J. M. (2014). "Teaching to the Test" in the NCLB Era: How Test Predictability Affects Our Understanding of Student Performance. Educational Researcher, 43 (8): 381–389.

Koretz, D. (2015). Adapting Educational Measurement to the Demands of Test-Based Accountability. Measurement: Interdisciplinary Research and Perspectives, 13 (1): 1–25.

Koretz, D., Jennings, J. L., Ng, H. L., Yu, C., Braslow, D., and Langi, M. (2016). Auditing for Score Inflation Using Self-Monitoring Assessments: Findings from Three Pilot Studies. Educational Assessment, 21 (4): 231–247.

Koretz, D., Yu, C., Mbekeani, P., Langi, M., Dhaliwal, T., and Braslow, D. (2016). Predicting Freshman Grade Point Average from College Admissions Test Scores and State High School Test Scores. AERA Open, 2 (4).

Ng, H. L., and Koretz, D. (2015). Sensitivity of School-Performance Ratings to Scaling Decisions. Applied Measurement in Education, 28 (4): 330–349.

** This project was submitted to and funded under Education Policy, Finance, and Systems in FY 2011.
