Viewpoints and Findings from the REL Mid-Atlantic
Balancing Act: Shortening State Assessments Without Sacrificing Reliability
By Jacob Hartog
Reducing the time teachers spend administering assessments can alleviate the stress of testing on students and educators and reduce lost instructional time, an especially high priority following school closures during the pandemic. Testing time increased significantly in the past decade after adoption of Common Core–based assessments. Education leaders must weigh the benefits of assessment, such as better information for targeted support and improvement for schools, educators, and students, against the loss of instructional time and other burdens of testing on schools, staff, and students. Understanding these trade-offs can help states and districts across the nation plan to return to full-time in-person instruction and inform state-level assessments.
In 2019, well before the onset of the global pandemic, the New Jersey Department of Education collaborated with the Partnership for Assessment of Readiness for College and Careers (PARCC) consortium and the test developer New Meridian to create a shorter test, the New Jersey Student Learning Assessments, which was based on the same academic standards and shared a common set of questions with the PARCC assessments. This change reduced testing times by 25 to 33 percent per test, returning up to two hours to instruction. After the change, the New Jersey Department of Education wanted to verify that group-level results from the new tests would remain stable and reliable for schools, student subgroups, and subject tests in each grade. These results are publicly reported and often used for school accountability. The department partnered with REL Mid-Atlantic to understand the effects of this change on the reliability of results. If the results remained stable and reliable, they would be consistent over time, and New Jersey Student Learning Assessments results would be comparable to those obtained using the previous PARCC assessments.
The results are encouraging. Group-level assessment scores were highly stable before and during the years of the test change. As the figure below shows, the correlation of group-level results from one year to the next did not meaningfully change from 2018 to 2019 (when the test was shortened) relative to the correlations in prior years when the test was longer. In other words, group-level aggregates of test results (such as average scores by grade level or student subgroup) did not become less reliable even when the test was shortened. Even test components, such as reading and writing, and sub-test outcomes, such as reading informational text or mathematical reasoning, were consistently stable when reported at the group level. And the stability of results for smaller schools and smaller reported groups of students did not decrease.
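The reliability check described above can be illustrated with a minimal sketch: compute the Pearson correlation of school-level mean scores between two consecutive years, and compare that correlation across year-pairs before and after the test change. The school names and scores below are invented for illustration and are not data from the study.

```python
# Sketch of a year-to-year stability check: Pearson correlation of
# group-level (here, school-level) mean scores across consecutive years.
# All names and values are hypothetical, not from the New Jersey study.
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Mean scale scores by school (illustrative values only)
means_2018 = {"School A": 742.1, "School B": 755.6,
              "School C": 731.8, "School D": 748.3}
means_2019 = {"School A": 744.0, "School B": 753.9,
              "School C": 733.5, "School D": 747.1}

# Align schools, then correlate their means across the two years
schools = sorted(means_2018)
r = pearson([means_2018[s] for s in schools],
            [means_2019[s] for s in schools])
print(f"Year-to-year correlation of school means: r = {r:.2f}")
```

A high correlation across the year-pair spanning the test change, comparable to the correlations in earlier year-pairs, is the kind of evidence the study used to conclude that group-level results stayed reliable.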
These analyses did not examine the stability of individual students' results, which could be more affected by the reduction in test length. Nonetheless, New Jersey's experience suggests that reducing testing time, while maintaining the same academic standards and bank of test items, does not necessarily reduce the reliability of the school-, student-group-, or test-level results that states typically use for public reporting and accountability. Other states might consider these results as they weigh the trade-offs of reducing testing time to increase time spent on teaching and learning.
For more about this study and the shift from the PARCC to the New Jersey Student Learning Assessments, see the full report at https://ies.ed.gov/ncee/edlabs/projects/project.asp?projectID=6721.
Figure note: Group-level test results were as highly correlated across the year-pair in which the test was shortened (2018 to 2019) as across earlier year-pairs, when the test was unchanged.
Fox, L., Hartog, J., & Larkin, N. (2021). The reliability of shorter assessments in New Jersey for group-level inferences. Regional Educational Laboratory Mid-Atlantic. https://ies.ed.gov/ncee/edlabs/projects/project.asp?projectID=6721
Olson, L., & Jerald, C. (2020). The big test: The future of statewide standardized assessments. FutureEd, Georgetown University. https://www.future-ed.org/wp-content/uploads/2020/04/TheBigTest_Final-1.pdf