The Late Pretest Problem in Randomized Control Trials of Education Interventions

NCEE 2009-4033
October 2008

Chapter 5: Theoretical Framework

This chapter discusses the statistical theory underlying the variance-bias tradeoff associated with including pretests in the posttest impact models for two-level clustered RCTs. The theory is discussed in the context of the causal inference theory underlying RCTs (Neyman 1923; Rubin 1974; Holland 1986; Imbens and Rubin 2007; Freedman 2008; Schochet 2007).

It is assumed that students are nested within $n$ units (schools or classrooms), each of which is randomly assigned to either a single treatment group or a control group. The sample is assumed to contain $np$ treatment units and $n(1-p)$ control units, where $p$ is the sampling rate to the treatment group ($0 < p < 1$).

This paper considers a "superpopulation" version of the Neyman-Rubin causal inference model (see Imbens and Rubin 2007; Schochet 2007; and Yang and Tsiatis 2001). Let $Z_{1Ti}$ be the "potential" unit-level continuous posttest score for unit $i$ in the treatment condition and $Z_{1Ci}$ be the potential posttest score for unit $i$ in the control condition. Potential posttest scores for the $n$ study units are assumed to be random draws from potential treatment and control posttest distributions in the study population, with means $\mu_{1T}$ and $\mu_{1C}$, respectively; a common variance $\sigma_1^2 > 0$ is assumed for each research group to ensure that variance estimates based on standard ordinary least squares (OLS) methods are justified by the Neyman-Rubin causal model (Freedman 2008; Schochet 2007). It is assumed that treatment assignments are independent of potential outcomes (due to random assignment), and that potential outcomes for each unit are unrelated to the treatment status of other units. Finally, let $Z_{0Ti}$, $Z_{0Ci}$, $\mu_{0T}$, $\mu_{0C}$, and $\sigma_0^2$ denote corresponding quantities for fall pretest scores, and let $\sigma_{01}$ denote the covariance between the potential pretest and posttest scores for both the treatment and control groups (which could depend on how late the pretests are collected).¹

Suppose next that $m$ students are sampled from the student superpopulation within each study unit. Let $Y_{1Tij}$ be the potential posttest score for student $j$ in unit $i$ in the treatment condition and $Y_{1Cij}$ be the corresponding potential posttest score for the student in the control condition. $Y_{1Tij}$ and $Y_{1Cij}$ are assumed to be random draws from student-level potential treatment and control posttest distributions (which are conditional on school-level potential outcomes) with means $Z_{1Ti}$ and $Z_{1Ci}$, respectively, and common variance $\tau_1^2 > 0$. Corresponding variables for student-level pretest scores are denoted by replacing subscripts of "1" by subscripts of "0". The covariance between student-level potential pretest and posttest scores within units is denoted by $\tau_{01}$.²

Under this causal inference model, the difference between the two potential posttest scores, $(Z_{1Ti} - Z_{1Ci})$, is the unit-level treatment effect for unit $i$, and the average treatment effect (ATE) parameter is $E(Z_{1Ti} - Z_{1Ci}) = \mu_{1T} - \mu_{1C}$. The unit-level treatment effects, and hence the ATE parameter, cannot be calculated directly because, for each unit and student, the potential outcome is observed in either the treatment or the control condition, but not in both. Formally, if $T_i$ is a treatment status indicator variable that equals 1 for treatments and 0 for controls, then the observed posttest score for a unit, $z_{1i}$, can be expressed as follows:

(4) $z_{1i} = T_i Z_{1Ti} + (1 - T_i) Z_{1Ci}$.

Similarly, the observed posttest score for a student, $y_{1ij}$, is:

(5) $y_{1ij} = T_i Y_{1Tij} + (1 - T_i) Y_{1Cij}$.

The simple equations in (4) and (5) form the basis for the causal inference theory presented below.
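As an illustration of how (4) and (5) select one potential outcome per unit and student, the following is a minimal simulation sketch. It is not part of the original analysis: the function name `simulate_posttests`, the default parameter values, the normal distributions, and the independent draws of the treatment and control potential outcomes are all assumptions made for illustration only.

```python
import numpy as np

def simulate_posttests(n=40, m=25, p=0.5, mu_1T=0.3, mu_1C=0.0,
                       sigma_1=0.4, tau_1=0.9, seed=0):
    """Draw potential posttest scores and apply equations (4) and (5).

    All defaults are hypothetical; normality and independence of the
    treatment and control draws are simplifying assumptions.
    """
    rng = np.random.default_rng(seed)

    # Unit-level potential outcomes Z_1Ti and Z_1Ci, common variance sigma_1^2.
    Z_1T = rng.normal(mu_1T, sigma_1, size=n)
    Z_1C = rng.normal(mu_1C, sigma_1, size=n)

    # Student-level potential outcomes, centered on the unit-level values,
    # with common within-unit variance tau_1^2.
    Y_1T = Z_1T[:, None] + rng.normal(0.0, tau_1, size=(n, m))
    Y_1C = Z_1C[:, None] + rng.normal(0.0, tau_1, size=(n, m))

    # Randomly assign round(n * p) units to the treatment group.
    T = np.zeros(n, dtype=int)
    T[rng.choice(n, size=int(round(n * p)), replace=False)] = 1

    # Equation (4): observed unit-level posttest score.
    z_1 = T * Z_1T + (1 - T) * Z_1C
    # Equation (5): observed student-level posttest score
    # (T is broadcast across the m students in each unit).
    y_1 = T[:, None] * Y_1T + (1 - T[:, None]) * Y_1C
    return T, z_1, y_1, Z_1T, Z_1C, Y_1T, Y_1C
```

Calling `simulate_posttests()` returns both the observed scores and the full set of potential outcomes. Only the former would be available in an actual study, which is why the unit-level treatment effects cannot be computed directly.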

The terms in (5) can be rearranged to create the following regression model:
 
(6) $y_{1ij} = \alpha_0 + \alpha_1 T_i + (u_{1i} + e_{1ij})$, where

1. $\alpha_0 = \mu_{1C}$ and $\alpha_1 = \mu_{1T} - \mu_{1C}$ (the ATE parameter) are coefficients to be estimated

2. $u_{1i} = T_i (Z_{1Ti} - \mu_{1T}) + (1 - T_i)(Z_{1Ci} - \mu_{1C})$ is a unit-level error term with mean zero and between-unit variance $\sigma_1^2$ that is uncorrelated with $T_i$

3. $e_{1ij} = T_i (Y_{1Tij} - Z_{1Ti}) + (1 - T_i)(Y_{1Cij} - Z_{1Ci})$ is a student-level error term with mean zero and within-unit variance $\tau_1^2$ that is uncorrelated with $u_{1i}$ and $T_i$
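
The rearrangement from (5) to (6) can be made explicit by adding and subtracting the unit-level potential outcomes and their means within each term of (5). The steps below are a reconstruction from the definitions above rather than a derivation quoted from the source, and they write the unit-level error term of item 2 as $u_{1i}$:

```latex
\begin{align*}
y_{1ij} &= T_i Y_{1Tij} + (1 - T_i) Y_{1Cij} \\
        &= T_i \left[ \mu_{1T} + (Z_{1Ti} - \mu_{1T}) + (Y_{1Tij} - Z_{1Ti}) \right] \\
        &\quad + (1 - T_i) \left[ \mu_{1C} + (Z_{1Ci} - \mu_{1C}) + (Y_{1Cij} - Z_{1Ci}) \right] \\
        &= \mu_{1C} + (\mu_{1T} - \mu_{1C}) T_i + u_{1i} + e_{1ij} \\
        &= \alpha_0 + \alpha_1 T_i + (u_{1i} + e_{1ij}).
\end{align*}
```

The third line collects the mean terms using $T_i \mu_{1T} + (1 - T_i) \mu_{1C} = \mu_{1C} + (\mu_{1T} - \mu_{1C}) T_i$ and groups the remaining terms into the unit-level and student-level errors defined in items 2 and 3.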

Importantly, (6) can also be derived using the following two-level hierarchical linear model (HLM) (Bryk and Raudenbush 1992):

Level 1: $y_{1ij} = Z_{1i} + e_{1ij}$
Level 2: $Z_{1i} = \alpha_0 + \alpha_1 T_i + u_{1i}$,

where Level 1 corresponds to students and Level 2 to units. Inserting the Level 2 equation into the Level 1 equation yields (6). Thus, the HLM approach is consistent with the causal inference theory presented above.³
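
To make the role of the coefficients in (6) concrete, the sketch below generates data from (6) and fits a pooled student-level OLS regression of the posttest on an intercept and the treatment indicator. The sample sizes, variance components, coefficient values, and normal errors are hypothetical choices for illustration. The sketch covers the point estimate only; it does not implement an HLM estimator or cluster-adjusted standard errors.

```python
import numpy as np

rng = np.random.default_rng(2)
n, m = 60, 20                           # hypothetical units, students per unit
T = rng.permutation(np.repeat([1, 0], n // 2))   # half the units treated
u = rng.normal(0.0, 0.4, size=n)        # unit-level error u_1i
e = rng.normal(0.0, 0.9, size=(n, m))   # student-level error e_1ij
alpha_0, alpha_1 = 0.0, 0.3             # hypothetical true coefficients

# Equation (6): y_1ij = alpha_0 + alpha_1 * T_i + (u_1i + e_1ij)
y = alpha_0 + alpha_1 * T[:, None] + u[:, None] + e

# Pooled student-level OLS of y on an intercept and T.
X = np.column_stack([np.ones(n * m), np.repeat(T, m)])
coef, *_ = np.linalg.lstsq(X, y.ravel(), rcond=None)

# With a single binary regressor, the OLS slope equals the difference
# between the treatment and control group means.
diff_in_means = y[T == 1].mean() - y[T == 0].mean()
```

Here `coef[1]` is the estimate of $\alpha_1$ and coincides with `diff_in_means`, while `coef[0]` is the control group mean.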

A similar approach can be used to develop a regression model for the observed pretest scores:

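(7) $y_{0ij} = \beta_0 + \beta_1 T_i + (u_{0i} + e_{0ij})$,

where $\beta_0 = \mu_{0C}$, $\beta_1 = \mu_{0T} - \mu_{0C}$, and $u_{0i}$ and $e_{0ij}$ are between- and within-unit error terms, respectively, with the following properties: $E(u_{0i}) = E(e_{0ij}) = 0$; $E(T_i u_{0i}) = E(T_i e_{0ij}) = 0$; $\mathrm{Var}(u_{0i}) = \sigma_0^2$; $\mathrm{Var}(e_{0ij}) = \tau_0^2$; $\mathrm{Cov}(u_{0i}, e_{0ij}) = 0$; $\mathrm{Cov}(u_{0i}, u_{1i}) = \sigma_{01}$; and $\mathrm{Cov}(e_{0ij}, e_{1ij}) = \tau_{01}$.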

If the pretests are "true" baselines, $\beta_1$ will equal zero because of random assignment. Stated differently, with true baselines, $Z_{0Ti} = Z_{0Ci}$ and $Y_{0Tij} = Y_{0Cij}$. With late baselines, the size and sign of $\beta_1$ will depend on the growth trajectory of intervention effects, the overall timing of baseline testing, and differences in testing-date distributions across the treatment and control groups. For example, $\beta_1$ will tend to be positive if the intervention has early beneficial effects or if pretest testing dates are, on average, later for treatments than for controls.
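
The contrast between true and late baselines can be illustrated with a small simulation sketch. The function `pretest_gap`, the constant additive `early_effect`, the normal distributions, and all numeric defaults are hypothetical. A constant shift is only one simple way a late pretest might absorb an early intervention effect, and it is not the source's model of growth trajectories.

```python
import numpy as np

def pretest_gap(n=2000, m=20, early_effect=0.0, seed=1):
    """Estimate beta_1 in (7) as the treatment-control gap in mean pretests.

    early_effect is a hypothetical gain already realized by treated units
    when the pretest is administered late; 0.0 gives a true baseline.
    """
    rng = np.random.default_rng(seed)
    T = rng.permutation(np.repeat([1, 0], n // 2))  # half the units treated

    # True-baseline potential pretest scores coincide across conditions.
    Z_0C = rng.normal(0.0, 0.4, size=n)
    # Late pretest: treated units have already gained early_effect.
    Z_0T = Z_0C + early_effect

    # Observed unit-level and student-level pretest scores.
    z_0 = T * Z_0T + (1 - T) * Z_0C
    y_0 = z_0[:, None] + rng.normal(0.0, 0.9, size=(n, m))

    # Difference in means of the unit-level average pretest scores.
    unit_means = y_0.mean(axis=1)
    return unit_means[T == 1].mean() - unit_means[T == 0].mean()
```

With `early_effect=0.0` the estimated gap differs from zero only through sampling error. With a positive `early_effect` and the same seed, the gap shifts upward by exactly that amount, mirroring the positive $\beta_1$ described above.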


1 Neyman (1923) considered a "finite population" model where potential outcomes are assumed to be fixed for the study population and where the only source of randomness is treatment status.
2 Equal cluster sample sizes are assumed for simplicity, and because this largely holds in clustered RCT designs in the education area. The results presented in this paper apply approximately for unequal cluster sizes if $m$ is replaced in the formulas by the average cluster size $\bar{m}$ (Kish 1965).
3 It is assumed that there are no biases due to missing posttest data. Davidian et al. (2005) discuss semiparametric estimation of treatment effects in a pretest-posttest study with missing data.