Chapter 7: Illustrative Precision Calculations

In this section, I collate formulas from above and use key design parameter values from the literature to obtain illustrative MDE calculations for RD designs in the education field. The focus is on standardized test scores of elementary school and preschool students in low-performing schools. MDEs are calculated for each design considered above (where I use the multilevel versions of Designs II and III).

Tables 7.1 and 7.2 display, under various assumptions and for each of the considered RD designs, the *total number of schools* required to achieve precision targets of 0.20, 0.25, and 0.33 of a standard deviation. These benchmarks are typically used in impact evaluations of educational interventions to balance statistical rigor and study costs (Schochet 2008; Hill et al. 2007). In
Table 7.1, it is assumed that the score cutoff
is at the center of the score distribution and that the treatment and control group
samples are balanced. In Table 7.2, it is assumed
that the cutoff is at a tertile of the score distribution and that there is a 2:1
split of the research samples. Table 7.3 presents
comparable figures to those in Table 7.1 for
the RA design.

Because the amount and quality of baseline data vary across evaluations, the power calculations are conducted assuming *R*^{2} values of 0, 0.20, 0.50, and 0.70 at each group level.

To keep the presentation manageable, RD design effects are presented assuming that scores are normally distributed. As discussed, for a given treatment-control sample split, the RD design effect does not vary much according to the score distribution or location of the cutoff score. Thus, the results that are presented are broadly applicable, but could easily be revised using the alternative score distributions or parameter values that were discussed above.
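To make this concrete, the RD design effect can be evaluated directly for a normally distributed score. The sketch below assumes the design-effect formula 1/(1 − ρ²) from the earlier chapters, where ρ is the correlation between the treatment indicator (score at or above the cutoff) and the score itself; the function name is illustrative, not from the text.

```python
from statistics import NormalDist

def rd_design_effect(cutoff_quantile: float) -> float:
    """RD design effect 1/(1 - rho^2) for a standard-normal score,
    where rho is the correlation between the score Z and the
    treatment indicator T = 1(Z >= cutoff)."""
    nd = NormalDist()
    c = nd.inv_cdf(cutoff_quantile)       # cutoff on the score scale
    p = 1.0 - nd.cdf(c)                   # share assigned to treatment
    cov = nd.pdf(c)                       # Cov(T, Z) = E[Z * 1(Z >= c)] = phi(c)
    rho_sq = cov ** 2 / (p * (1.0 - p))   # squared correlation of T and Z
    return 1.0 / (1.0 - rho_sq)

print(round(rd_design_effect(0.5), 2))    # cutoff at the median (balanced split)
print(round(rd_design_effect(2 / 3), 2))  # cutoff at a tertile (2:1 split)
```

For a cutoff at the median this gives π/(π − 2) ≈ 2.75; moving the cutoff to a tertile changes the value only modestly, consistent with the claim above that the design effect is not very sensitive to the cutoff location.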

The estimates also assume:

- A two-tailed test at 80 percent power and a 5 percent significance level
- The intervention is being tested in a single grade with an average of 3 classrooms per school per grade and an average of 23 students per classroom. Thus, the sample contains 69 students per school.
- 80 percent of students in the sample will provide follow-up (posttest) data, so that posttest data are available for about 55 students per school.
- ICC values of 0.15 at the school and classroom levels (which are consistent with the empirical findings in Schochet 2008, Hedges and Hedberg 2007, and Bloom et al. 2005a)
- An ICC value of 0.15 pertaining to the variance of treatment effects across schools in Designs IV and V (Schochet 2008)
- A sharp RD design rather than a fuzzy RD design (that is, that all units comply with their treatment assignments)
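Under these assumptions, the MDE shrinks with the square root of the number of schools, so inflating the variance by the RD design effect inflates the required number of schools by the same factor. A back-of-envelope sketch (the helper function and the design-effect value of 2.75 for a balanced normal-score design are assumptions, not taken from the tables):

```python
import math

# Hypothetical helper: scale a random-assignment (RA) school requirement
# to its RD counterpart.  Since MDE is proportional to 1/sqrt(J),
# multiplying the variance by the RD design effect multiplies the
# required number of schools J by that same factor.
def schools_required_rd(schools_required_ra: int, design_effect: float = 2.75) -> int:
    return math.ceil(schools_required_ra * design_effect)

# Design III example from the text: 42 schools suffice under RA.
print(schools_required_rd(42))  # back-of-envelope RD requirement
```

The result (about 116 schools) is close to the 114 reported for Design III in Table 7.1; small differences reflect rounding and degrees-of-freedom adjustments in the exact calculations.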

The key results can be summarized as follows:

- **Much larger sample sizes are typically required under RD than RA designs.** Consider the most commonly used design in education-related impact studies, where equal numbers of schools are assigned to treatment or control status. Under this design, about 114 total schools (57 treatment and 57 control) are required to yield an MDE of 0.25 standard deviations, assuming a regression *R*^{2} value of 0.50 (Design III; Table 7.1). The corresponding figure for the RA design is only 42 total schools (Table 7.3). Similarly, for the classroom-based Design II, the required number of schools is 45 for the RD design (Table 7.1), compared to only 16 for the RA design (Table 7.3).
- **Because of resource constraints, school-based RD designs may only be feasible for interventions that are likely to have relatively large effects (about 0.33 standard deviations or more).** Under Design III, 66 schools (33 treatment and 33 control) are required to achieve an MDE of 0.33 standard deviations (assuming an *R*^{2} value of 0.50; Table 7.1). This number is comparable to the number of schools that are typically included in large-scale experimental impact evaluations funded by the U.S. Department of Education.
- **A 2:1 split of the sample has a small effect on statistical power.** The required school sample sizes are similar in Tables 7.1 and 7.2. This occurs because, as discussed, a balanced sample allocation yields larger RD design effects than an unbalanced allocation, but also yields smaller variances under the RA design; these two effects are largely offsetting.
- ***R*^{2} values matter.** The viability of RD designs in education research hinges critically on the availability of detailed baseline data at the aggregate school or individual student level (in particular, pretest data) that can be used as covariates in the regression models to improve *R*^{2} values. For instance, for the school-based Design III, the number of schools required to achieve an MDE of 0.33 standard deviations is 39 if the *R*^{2} value is 0.70, 66 if the *R*^{2} value is 0.50, and 131 if the *R*^{2} value is zero (Table 7.1).
- **RD designs may be most viable for less-clustered designs where classrooms or students are the unit of treatment assignment.** For example, under the classroom-based Design II, 45 schools are required to achieve an MDE of 0.25 standard deviations, assuming an *R*^{2} value of 0.50 (Table 7.1). The comparable figure for the classroom-based Design V (with random school effects) increases to only 53 because, as discussed, RD design effects are smaller for this design than for Design II. For the student-level Design I, the comparable number of required schools is 13 (Table 7.1).
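The *R*^{2} figures for Design III can be cross-checked with simple scaling: when the same *R*^{2} applies at every level, the estimator's variance (and hence the required number of schools for a fixed MDE) is proportional to (1 − *R*^{2}). A minimal sketch, anchoring on the 66-school figure at *R*^{2} = 0.50 (the helper is hypothetical):

```python
# Consistency check: with equal R^2 at every level, the variance of the
# impact estimator -- and so the required number of schools J for a
# fixed MDE -- scales with (1 - R^2).
def scale_schools(j_ref: int, r2_ref: float, r2_new: float) -> float:
    return j_ref * (1.0 - r2_new) / (1.0 - r2_ref)

# Design III anchor: 66 schools at R^2 = 0.50 for an MDE of 0.33 SD.
print(scale_schools(66, 0.50, 0.70))  # prediction for R^2 = 0.70
print(scale_schools(66, 0.50, 0.0))   # prediction for R^2 = 0
```

The predictions (about 40 and 132 schools) line up with the 39 and 131 reported above, with the small gaps attributable to rounding in the tabulated values.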