Technical Methods Report: Statistical Power for Regression Discontinuity Designs in Education Evaluations

NCEE 2008-4026
August 2008

Chapter 5: Multilevel RD Designs

In this section, the results presented above are generalized to multilevel designs in which data are analyzed at the student rather than the group level. Designs II and III are discussed first, followed by Designs IV to VI.

Designs II and III

The causal inference theory discussed above can be extended to the two-level design where students are nested within units that are assigned to a research status. As before, let YTi and YCi be unit-level potential outcomes and Scorei be unit-level assignment scores, whose joint distributions are defined as in Chapter 4.8 The sample contains np treatment units and n(1-p) control units.

Suppose that m students are sampled from the student superpopulation within each study unit. Let WTij be the potential outcome for student j in unit i in the treatment condition and WCij be the corresponding potential outcome for the student in the control condition. It is assumed that WTij and WCij are random draws from student-level potential treatment and control outcome distributions (that are conditional on school-level potential outcomes) with means YTi and YCi, respectively, and common variance σθ2.

In what follows, the two-level RA and RD designs are discussed using this causal inference framework.

The RA Design
Under the RA design, the observed outcome for a student, wijRA, can be expressed as follows:

$$w_{ij}^{RA} = T_i^{RA} W_{Tij} + (1 - T_i^{RA}) W_{Cij} \qquad (18)$$

As before, terms in (18) can be rearranged to create the following regression model:

$$w_{ij}^{RA} = \alpha_0 + \alpha_1 T_i^{RA} + \lambda_i + \theta_{ij} \qquad (19)$$

where:


  1. $\alpha_0$ and $\alpha_1$ (the ATE parameter) are defined as above
  2. $\lambda_i = T_i^{RA}(Y_{Ti} - \mu_T) + (1 - T_i^{RA})(Y_{Ci} - \mu_C)$ is a unit-level error term with mean zero and between-unit variance $\sigma_\lambda^2$ that is uncorrelated with $T_i^{RA}$
  3. $\theta_{ij} = T_i^{RA}(W_{Tij} - Y_{Ti}) + (1 - T_i^{RA})(W_{Cij} - Y_{Ci})$ is a student-level error term with mean zero and within-unit variance $\sigma_\theta^2$ that is uncorrelated with $\lambda_i$ and $T_i^{RA}$

Importantly, (19) can also be derived from the following two-level hierarchical linear model (HLM) (Bryk and Raudenbush 1992):

Level 1: $w_{ij}^{RA} = \beta_i + \theta_{ij}$

Level 2: $\beta_i = \alpha_0 + \alpha_1 T_i^{RA} + \lambda_i$

where Level 1 corresponds to students and Level 2 corresponds to units. Inserting the Level 2 equation into the Level 1 equation yields (19). Thus, the HLM approach is consistent with the causal inference theory presented above.
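As an illustration, the pooled model in (19) can be simulated directly. The sketch below uses made-up parameter values (the sample sizes, variances, and true ATE are illustrative, not figures from the report) and checks that a student-level treatment-control contrast recovers $\alpha_1$:

```python
# Hypothetical simulation of model (19); all parameter values are made up.
import numpy as np

rng = np.random.default_rng(12345)
n, m, p = 2000, 10, 0.5          # units, students per unit, treatment share
alpha0, alpha1 = 10.0, 2.0       # intercept and true ATE parameter
sigma_lambda, sigma_theta = 1.0, 2.0

T = (rng.random(n) < p).astype(float)    # unit-level treatment indicator T_i
lam = rng.normal(0.0, sigma_lambda, n)   # unit-level error lambda_i
# Student-level outcomes: w_ij = alpha0 + alpha1*T_i + lambda_i + theta_ij
theta = rng.normal(0.0, sigma_theta, (n, m))
w = (alpha0 + alpha1 * T + lam)[:, None] + theta

# OLS of w_ij on T_i reduces to the treatment-control difference in means
ate_hat = w[T == 1.0].mean() - w[T == 0.0].mean()
print(round(ate_hat, 2))
```

With a balanced design and no covariates, the student-level OLS estimate coincides with the difference in mean outcomes, so the estimate should land close to the true ATE of 2.0.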

Suppose that Scorei and other unit- and student-level baseline covariates are included in (19) as covariates. In this case, the asymptotic variance of the two-level (TL) OLS estimator for the ATE parameter is as follows:

$$AsyVar(\hat{\alpha}_1^{TL}) = \frac{1}{np(1-p)}\left[\sigma_\lambda^2\,(1 - R_{RA\_X\_B}^2) + \frac{\sigma_\theta^2\,(1 - R_{RA\_X\_W}^2)}{m}\right] \qquad (20)$$

where $R_{RA\_X\_B}^2$ is the asymptotic regression $R^2$ value for the between-unit variance component and $R_{RA\_X\_W}^2$ is the asymptotic regression $R^2$ value for the within-unit variance component. These two $R^2$ values could differ depending on the nature of the covariates.

The within-school variance term in (20) is the conventional variance expression for an impact estimate under a nonclustered design. Design effects in a clustered design arise because of the first variance term, which represents the correlation of the outcomes of students within the same units (Murray 1998; Donner and Klar 2000; Raudenbush 1997). Design effects can be large because the divisor in the between-unit term is the number of units rather than the number of students.
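To make the source of the design effect concrete, the sketch below (using made-up variance components and sample sizes, and ignoring the $R^2$ terms) computes the two terms of (20) and the implied design effect relative to a simple random sample of the same $nm$ students:

```python
# Made-up values; R^2 covariate terms in (20) are ignored for simplicity.
sigma_lambda2 = 0.05    # between-unit variance
sigma_theta2 = 0.95     # within-unit variance
n, m, p = 60, 100, 0.5  # units, students per unit, treatment share

# The two terms of the clustered variance: note the between-unit term is
# divided by the number of units n, not the number of students n*m.
between = sigma_lambda2 / (n * p * (1 - p))
within = sigma_theta2 / (n * m * p * (1 - p))
var_clustered = between + within

# Variance if the n*m students were instead independent draws
var_srs = (sigma_lambda2 + sigma_theta2) / (n * m * p * (1 - p))

deff = var_clustered / var_srs   # equals 1 + (m - 1) * ICC here
print(round(deff, 2))
```

Even with only 5 percent of the outcome variance lying between units, a cluster size of 100 students yields a design effect of nearly 6, illustrating why the between-unit term dominates precision calculations.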

It is common to express the variance expression in (20) in terms of the intraclass correlation (ICC) (Cochran 1963; Kish 1965), which is defined as the between-unit variance (σλ2) as a proportion of the total variance of the outcome measure (σ2λ2θ2):

$$ICC = \frac{\sigma_\lambda^2}{\sigma_\lambda^2 + \sigma_\theta^2},$$

so that (20) can be rewritten as:

$$AsyVar(\hat{\alpha}_1^{TL}) = \frac{(\sigma_\lambda^2 + \sigma_\theta^2)}{np(1-p)}\left[ICC\,(1 - R_{RA\_X\_B}^2) + \frac{(1 - ICC)(1 - R_{RA\_X\_W}^2)}{m}\right] \qquad (21)$$

In this formulation, design effects from clustering are small if the mean of the outcome measure does not vary much across units (that is, if the ICC is small). In this case, the approach discussed above, in which student-level data are averaged to the unit level, will provide consistent but inefficient impact estimates. On the other hand, if the ICC is large (that is, close to 1), then using unit averages or individual student-level data will produce impact estimates with similar levels of precision. Specific ICC (and R2) values will depend on the design.
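The equivalence of the two parameterizations can be checked numerically. The values below are made up and the $R^2$ terms are again ignored:

```python
# Made-up values; verifies that the ICC form in (21) matches the
# variance-components form in (20) when the R^2 terms are ignored.
sigma_lambda2, sigma_theta2 = 0.2, 0.8
n, m, p = 40, 25, 0.5

total = sigma_lambda2 + sigma_theta2
icc = sigma_lambda2 / total

var_components = (sigma_lambda2 + sigma_theta2 / m) / (n * p * (1 - p))
var_icc = total * (icc + (1 - icc) / m) / (n * p * (1 - p))
print(abs(var_components - var_icc) < 1e-12)
```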

The RD Design
Results from Chapter 4 for the RD design can also be extended to the two-level model. Let the observed outcome for a student, wijRD, be expressed as follows:

$$w_{ij}^{RD} = T_i^{RD} Y_{Ti} + (1 - T_i^{RD}) Y_{Ci} + \left[T_i^{RD}(W_{Tij} - Y_{Ti}) + (1 - T_i^{RD})(W_{Cij} - Y_{Ci})\right] \qquad (22)$$

where the term inside the brackets is a mean zero residual term. If $Y_{Ti}$ and $Y_{Ci}$ are modeled as linear functions of the assignment scores (as in [7a] and [7b]), (22) yields the following two-level RD regression model:

$$w_{ij}^{RD} = \alpha_0 + \alpha_1 T_i^{RD} + \alpha_2 (Score_i - K) + \tau_i + \delta_{ij} \qquad (23)$$

where τi is a mean zero unit-level error term with variance στ2, and δij is a mean zero student-level error term with variance σδ2 that is uncorrelated with τi. The parameter α1 is the same ATEK parameter as in (8) above.

As with the RA design, (23) can be obtained using a two-level HLM model:

Level 1: $w_{ij}^{RD} = \beta_i + \delta_{ij}$

Level 2: $\beta_i = \alpha_0 + \alpha_1 T_i^{RD} + \alpha_2 (Score_i - K) + \tau_i$

Inserting the Level 2 equation into the Level 1 equation yields (23).

In Appendix B, it is proved that the two-level OLS estimator α̂1 in (23) yields a consistent estimator of the ATEK parameter assuming that the model is specified correctly. This result holds even if Scorei is correlated with τi. Assuming that additional baseline covariates are included in the model, the asymptotic variance of this estimator is as follows (see Appendix B):

$$AsyVar(\hat{\alpha}_1^{TL}) = \frac{1}{(1-\rho_{TS}^2)} \cdot \frac{1}{np(1-p)}\left[\sigma_\tau^2\,(1 - R_{RD\_X\_B}^2) + \frac{\sigma_\delta^2\,(1 - R_{RD\_X\_W}^2)}{m}\right] \qquad (24)$$
where $R_{RD\_X\_B}^2$ and $R_{RD\_X\_W}^2$ are the between- and within-unit asymptotic regression $R^2$ values, respectively.

The RD Design Effect
A key finding is that the RD design effect remains at $1/(1-\rho_{TS}^2)$ under the two-level design. This is because the variances inside the brackets in (24) for the RD design equal the variances inside the brackets in (20) or (21) for the RA design. Thus, relative to the aggregated model presented above, the use of the two-level model for Designs II and III will typically improve the precision of the impact estimates for both the RD and RA designs. However, the proportional improvement in precision is the same for each design, so the RD design effect does not change. Similar results for the RD design effect also apply to the fuzzy RD design and to the calculation of MDEs.
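As a numeric illustration (with a made-up value of the treatment-score correlation and made-up variance components), the design effect scales the full bracketed variance, so the RD-to-RA variance ratio equals $1/(1-\rho_{TS}^2)$ regardless of the clustering parameters:

```python
# Made-up values; rho_ts is a hypothetical correlation between treatment
# status and the assignment score, not a figure from the report.
rho_ts = 0.87
deff_rd = 1.0 / (1.0 - rho_ts**2)    # RD design effect

sigma_tau2, sigma_delta2 = 0.1, 0.9  # between- and within-unit variances
n, m, p = 80, 60, 0.5

var_ra = (sigma_tau2 + sigma_delta2 / m) / (n * p * (1 - p))
var_rd = deff_rd * var_ra

# The ratio equals the design effect for any n, m, p, or variance split;
# MDEs grow by the square root of this factor.
print(round(var_rd / var_ra, 2), round(deff_rd**0.5, 2))
```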

Designs IV, V, and VI

Theoretical results from the models discussed above carry over directly to Design VI, where schools are the unit of assignment and classroom effects are treated as random. Ignoring R2 terms for simplicity, the variance expression for the RD impact estimator for this design is:

$$AsyVar(\hat{\alpha}_1) = \frac{1}{(1-\rho_{TS}^2)} \cdot \frac{\sigma^2}{np(1-p)}\left[ICC_1 + \frac{ICC_2}{c} + \frac{(1 - ICC_1 - ICC_2)}{cm}\right]$$

where $ICC_1$ is the intraclass correlation at the school level, $ICC_2$ is the intraclass correlation at the classroom level, $\sigma^2$ is the total variance of the outcome measure, n is the number of schools, c is the number of study classrooms per school, and m is the number of students per classroom. This expression is the product of the variance expression for the RA design and the RD design effect $1/(1-\rho_{TS}^2)$.
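The Design VI expression can be wrapped in a small helper for power calculations. The function below is an illustrative sketch, again ignoring the $R^2$ terms, and all argument values in the example call are made up:

```python
# Illustrative helper for the Design VI variance (R^2 terms ignored);
# all argument values in the example call are made up.
def var_design_vi(sigma2, icc1, icc2, n, c, m, p, rho_ts):
    """Three-level RA variance (schools, classrooms, students) times the
    RD design effect 1/(1 - rho_ts^2)."""
    bracket = icc1 + icc2 / c + (1.0 - icc1 - icc2) / (c * m)
    ra_var = sigma2 * bracket / (n * p * (1 - p))
    return ra_var / (1.0 - rho_ts**2)

v = var_design_vi(sigma2=1.0, icc1=0.15, icc2=0.05, n=60, c=3, m=20,
                  p=0.5, rho_ts=0.87)
print(round(v, 4))
```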

The situation is somewhat different for Design IV where the unit of assignment is at the student level and site effects are treated as random. For this design, it is assumed that treatment effects are constant within sites, but not across sites. Instead, site-level treatment effects are assumed to be drawn from a population distribution with variance σI2.

In this case, the design effect 1/(1 -ρTS2) affects the student-level variance term, but not the site-level variance term. Ignoring R2 terms, the variance expression for the RD impact estimate under Design IV is:

$$AsyVar(\hat{\alpha}_1) = \frac{\sigma_I^2}{n} + \frac{1}{(1-\rho_{TS}^2)} \cdot \frac{\sigma_\theta^2}{nmp(1-p)}$$

where n is the number of sites and m is the number of students per site.

Thus, the RD design effect is smaller for Design IV than for the other designs considered above.
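This can be seen numerically. With the made-up values below, the overall RD-to-RA variance ratio under Design IV stays below the full design effect $1/(1-\rho_{TS}^2)$ because the site-level term is unaffected:

```python
# Made-up values; compares the overall Design IV variance inflation from
# the RD design to the full design effect 1/(1 - rho_ts^2).
rho_ts = 0.87
deff_full = 1.0 / (1.0 - rho_ts**2)

sigma_i2 = 0.02      # variance of site-level treatment effects
sigma_theta2 = 1.0   # student-level variance
n, m, p = 30, 200, 0.5

site_term = sigma_i2 / n
student_term = sigma_theta2 / (n * m * p * (1 - p))

var_ra = site_term + student_term
var_rd = site_term + deff_full * student_term  # design effect hits students only

print(round(var_rd / var_ra, 2), round(deff_full, 2))
```

The larger the share of total variance contributed by the site-level term, the further the effective design effect falls below the full factor.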

A similar situation occurs under Design V, where the variance expression for the RD impact estimate is:

$$AsyVar(\hat{\alpha}_1) = \frac{\sigma_{I1}^2}{n} + \frac{\sigma_{I2}^2}{nc} + \frac{1}{(1-\rho_{TS}^2)} \cdot \frac{\sigma_\theta^2}{ncmp(1-p)}$$

where $\sigma_{I1}^2$ and $\sigma_{I2}^2$ are the variances of the school- and classroom-level treatment effects, respectively.


8 For simplicity of illustration, common symbols and subscripts are used for each two-level design. This convention is followed for the remainder of this section.