
IES Grant

Title: The Distributional Implications of Computer-based Testing on Students, Teachers, and Schools
Center: NCER
Year: 2017
Principal Investigator: Backes, Benjamin
Awardee: American Institutes for Research (AIR)
Program: Improving Education Systems
Award Period: 2 years (07/01/2017 – 06/30/2019)
Award Amount: $429,644
Type: Exploration
Award Number: R305A170119
Description:

Co-Principal Investigator: Cowan, James

Purpose: In this study, the researchers examined the rollout of computer-based tests (CBTs) in Massachusetts to investigate test mode effects on student achievement outcomes. CBTs have spread rapidly as a way to assess student achievement because they offer more flexibility in test item design, access to a large repository of items, and faster score turnaround. As of the 2020s, dozens of states administer online exams. Through this project, the researchers aimed to inform state policymakers' decisions about whether and how to implement CBTs, with a special focus on the implications of CBTs for achievement gaps.

Project Activities: The researchers used state administrative data to (1) examine the extent to which the achievement of various student subgroups measured by CBTs differs systematically from achievement measured by paper-and-pencil tests, and explore factors that might mediate these gaps, such as previous exposure to online tests; (2) document the extent to which accountability systems for students, teachers, and schools are affected by test mode effects; and (3) explore the relative predictive validity of paper and online tests for later outcomes. To address these aims, they used difference-in-differences analyses comparing students in schools that administered the PARCC online with students in schools that administered it on paper in the same year, and they measured correlations between PARCC scores and later student outcomes.

Key Outcomes: The main findings from this study are as follows:

  • Students who took the online version of the Partnership for Assessment of Readiness for College and Careers (PARCC) in 2015 and 2016 initially fared worse than those who took the paper-and-pencil version, but the difference narrowed somewhat in the second year of online testing (Backes & Cowan, 2019).
  • There were no meaningful differences between paper and online test modes in their ability to predict later outcomes (Backes & Cowan, 2020).
  • Unlike the first years of PARCC implementation, which showed penalties for online testing, the new version of the Massachusetts Comprehensive Assessment System (MCAS) exhibited only small differences between test modes. The lack of mode differences indicates that it is possible to implement online assessments at scale without large online penalties (Backes & Cowan, 2020).

Structured Abstract

Setting: This study took place across the entire state of Massachusetts.

Sample: The sample included all public school students with valid test scores and demographic information from grades 3 through 8 in Massachusetts.

Factors: The researchers examined the decision to switch from paper to online testing and how this switch affected the measured performance of students. A concern about online tests is that they may measure skills that are not the focus of the assessment. For example, an online reading assessment may intend to measure reading constructs, but because it is delivered online, it may also measure respondents' computer literacy. Thus, the online version of an assessment may carry a penalty not present in the offline version. For this study, the researchers focused on Massachusetts assessments. During a multi-year transition from paper-based to online testing, Massachusetts administered the same exam in online and offline formats each year. The design of the phase-in generated two sources of variation in testing format. First, students in the same grade in different schools took the same test in different formats. Second, because Massachusetts transitioned different grades at different times, students in the same school in different grades first took online tests in different years.

Research Design and Methods: The research team addressed three specific research questions. The first question asked which students were likely to score below expectation on CBTs relative to paper tests, which factors moderate or mediate this relationship, and whether these effects faded over time as students and schools adapted to the new tests. Second, the team evaluated the practical implications of test mode effects for accountability systems. Third, the researchers examined whether paper-based and computer-based tests differed in their predictive validity for important student outcomes. Because of the multi-year transition and the state's phase-in approach, the researchers were able to estimate test mode effects by comparing the test scores of similar groups of students on CBTs versus paper tests in a given year.

Control Condition: The researchers compared the measured performance of students taking CBTs relative to observationally similar students who took the test on paper in the same year.

Key Measures: PARCC standardized test scores on online versus paper exams obtained from administrative data were the primary outcome of interest for the first two research questions. Outcomes for the third research question also included high school grade point average, attendance, and AP tests.
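
The abstract does not specify how predictive validity was compared across test modes. One illustrative framing, using assumed notation rather than the authors' actual model, is to relate a later outcome O_i (such as high school grade point average) to a student's test score separately by mode and compare the estimates:

\[
O_i = \alpha_m + \lambda_m \, \mathrm{Score}_i + u_i, \qquad m \in \{\text{paper}, \text{online}\}
\]

Similar slopes \lambda_m or similar model fit across the two modes would indicate that neither format provides a meaningfully better baseline for predicting later outcomes.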

Data Analytic Strategy: The researchers estimated difference-in-differences regression models comparing students who tested online with students who tested on paper in the same year, along with regressions relating test scores to later student outcomes.
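
The abstract does not report the exact specification. A minimal sketch of a difference-in-differences model consistent with the design described above, with all notation assumed for illustration, is:

\[
Y_{igst} = \beta \, \mathrm{Online}_{gst} + X_i' \delta + \gamma_{gt} + \mu_s + \varepsilon_{igst}
\]

where Y_{igst} is the score of student i in grade g, school s, and year t; Online_{gst} indicates that the student's school administered that grade's test online in year t; X_i are student covariates; \gamma_{gt} are grade-by-year effects; and \mu_s are school effects. In this framing, \beta captures the online testing penalty.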

Publications and Products

ERIC Citations: Available citations for this award can be found in ERIC.

Select Publications:

Backes, B., & Cowan, J. (2020, August). Is online a better baseline? Comparing the predictive validity of computer- and paper-based tests. CALDER Working Paper No. 241-0820.

Backes, B., & Cowan, J. (2019). Is the pen mightier than the keyboard? The effect of online testing on measured student achievement. Economics of Education Review, 68, 89–103.

