Technical Methods Report: Do Typical RCTs of Education Interventions Have Sufficient Statistical Power for Linking Impacts on Teacher Practice and Student Achievement Outcomes?

NCEE 2009-4065
October 2009

Chapter 5: Empirical Analysis - Identifying Plausible R2 Values

To obtain benchmark R2 values for the analysis, it is convenient to use estimates found in the literature on the proportion of the total variance in student gain scores that is due to classroom-level variation in gain scores—the ρ1 and ρ2 parameters from above (and the ICC parameters in Figure 3.1). As discussed, these ICCs are likely to provide an upper bound on the extent to which classroom-level mediators can explain the variation in student gain scores.

Chiang (2009) presents a host of ICC estimates drawn from the literature and from new data sources. The estimates pertain to fall-to-spring test score gains on various math, reading, and language arts tests for elementary school students. Most, but not all, of the studies were conducted in low-income schools.

The ICCs in Chiang (2009) vary across studies, reflecting differences in study samples and achievement tests. The classroom-level ICCs range from 0.02 to 0.15, and the school-level ICCs range from 0.05 to 0.20. Using mean values of ρ1 = 0.05 and ρ2 = 0.10, about 15 percent of the variance in student gain scores can be explained by differences in classroom effects within and between schools.

A measured mediator can be expected to capture only particular dimensions of teacher practices, and thus, to explain only a fraction of the 15 percent variation in classroom effects within and between schools (this fraction is denoted by R2CE,M in Figure 3.1). For example, Jacob and Lefgren (2005) found that principal assessments of teachers explained only about 10 percent of the variation in classroom effects on reading and math. Similarly, Aaronson et al. (2007) found that a host of teacher characteristics—including age, gender, race, educational background, tenure, and total experience—together explained only about 20 percent of the variation in classroom effects. Thus, it is likely that even a strong predictor of classroom effects could explain only a portion of this variation. Furthermore, mediator subscales, which can help determine which practices matter, may explain even less.

Based on this literature, the power calculations were conducted assuming that the mediator explains 10 percent of the 15 percent variation in classroom effects (that is, R2CE,M = 0.10 in Figure 3.1). This implies a benchmark R2 value of 1.5 percent for the mediator effect γ1 (which can be obtained using the relation R2y,M = ICC × R2CE,M = 0.15 × 0.10 in Figure 3.1). The calculations were also conducted using a more optimistic R2 value of 3 percent (R2CE,M = 0.20) and a less optimistic R2 value of 0.75 percent (R2CE,M = 0.05). Similarly, using values of ρ1 = 0.05 and ρ2 = 0.10, the power calculations assumed target R2 values of 0.005, 0.01, and 0.0025 for the analysis of mediator effects within schools (γ1W), and 0.01, 0.02, and 0.005 for the analysis of mediator effects between schools (γ1B).
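The arithmetic behind these target values can be reproduced in a few lines. The sketch below assumes the relation from Figure 3.1 (overall R2 = total ICC × R2CE,M, with the within- and between-school targets scaled by ρ1 and ρ2 respectively); the variable names are illustrative, not taken from the report.

```python
# Benchmark R2 values for the mediator effect, following the relation
# R2_y,M = ICC * R2_CE,M from Figure 3.1 (variable names are illustrative).

rho1 = 0.05  # classroom-level (within-school) ICC for gain scores
rho2 = 0.10  # school-level (between-school) ICC for gain scores
icc_total = rho1 + rho2  # total share of gain-score variance at the classroom level

# Assumed fractions of classroom-effect variance explained by the mediator.
scenarios = {"benchmark": 0.10, "optimistic": 0.20, "pessimistic": 0.05}

for label, r2_ce_m in scenarios.items():
    r2_overall = icc_total * r2_ce_m  # target R2 for the overall mediator effect
    r2_within = rho1 * r2_ce_m        # target R2 within schools (gamma_1W)
    r2_between = rho2 * r2_ce_m       # target R2 between schools (gamma_1B)
    print(f"{label}: overall={r2_overall:.4f}, "
          f"within={r2_within:.4f}, between={r2_between:.4f}")
```

Running this reproduces the three sets of targets in the text: 0.015/0.005/0.010 for the benchmark case, 0.030/0.010/0.020 for the optimistic case, and 0.0075/0.0025/0.0050 for the pessimistic case.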

Finally, viewing these target R2 values as squared correlations also suggests that they are nontrivial. For instance, the assumption that the mediator can explain 10 percent of the variance in estimated classroom effects implies a correlation of 0.32 between these two measures. Similarly, the assumption that the mediator can explain 20 percent of the variance implies a correlation of 0.45, which is larger than the correlations typically found in practice (Perez-Johnson et al. 2009).
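The implied correlations above are simply the square roots of the assumed R2CE,M values; a quick check:

```python
import math

# Correlation between the mediator and estimated classroom effects implied
# by the fraction of classroom-effect variance the mediator explains.
for r2 in (0.05, 0.10, 0.20):
    print(f"R2_CE,M = {r2:.2f} -> implied correlation = {math.sqrt(r2):.2f}")
```

The 0.10 and 0.20 scenarios yield correlations of 0.32 and 0.45, matching the figures in the text; the pessimistic 0.05 scenario implies a correlation of about 0.22.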
