
In this section, results from above are generalized to multilevel designs where data are analyzed at the student rather than group level. Designs II and III are discussed first, followed by a discussion of Designs IV to VI.

The causal inference theory discussed above can be extended to the two-level design
where students are nested within units that are assigned to a research status. As
before, let *Y _{Ti}* and

Suppose that *m* students are sampled from the student superpopulation within
each study unit. Let *W _{Tij}* be the potential outcome for student

In what follows, the two-level RA and RD designs are discussed using this causal inference framework.

**The RA Design**

Under the RA design, the observed outcome for a student, *w _{ij}^{RA}*,
can be expressed as follows:

As before, terms in (18) can be rearranged to create the following regression model:

where:

- α_{0} and α_{1} (the *ATE* parameter) are defined as above
- λ_{i} = *T _{i}^{RA}*(*Y _{Ti}* - μ_{T}) + (1 - *T _{i}^{RA}*)(*Y _{Ci}* - μ_{C}) is a unit-level error term with mean zero and between-unit variance σ_{λ}^{2} that is uncorrelated with *T _{i}^{RA}*
- θ_{ij} = *T _{i}^{RA}*(*W _{Tij}* - *Y _{Ti}*) + (1 - *T _{i}^{RA}*)(*W _{Cij}* - *Y _{Ci}*) is a student-level error term with mean zero and within-unit variance σ_{θ}^{2} that is uncorrelated with λ_{i} and *T _{i}^{RA}*

Importantly, (19) can also be derived using the following two-level hierarchical linear model (HLM) (Bryk and Raudenbush 1992):

where Level 1 corresponds to students and Level 2 corresponds to units. Inserting the Level 2 equation into the Level 1 equation yields (19). Thus, the HLM approach is consistent with the causal inference theory presented above.
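The composite structure described above can be illustrated with a small simulation. The sketch below assumes a Level 2 equation of the form β_{0i} = α_{0} + α_{1}*T _{i}* + λ_{i} and a Level 1 equation *w _{ij}* = β_{0i} + θ_{ij}, with normally distributed error terms. These functional forms, the parameter values, and the names `simulate_two_level` and `difference_in_means` are illustrative assumptions rather than anything taken from the report.

```python
import random


def simulate_two_level(n_units, m, alpha0, alpha1, sd_lambda, sd_theta,
                       p=0.5, seed=0):
    """Simulate student outcomes w_ij = alpha0 + alpha1*T_i + lambda_i + theta_ij.

    Assumption: both error terms are drawn from normal distributions.
    Returns a list of (unit_id, treatment_status, outcome) tuples.
    """
    rng = random.Random(seed)
    n_treat = int(round(p * n_units))
    # Assign exactly n_treat units to treatment, in random order.
    status = [1] * n_treat + [0] * (n_units - n_treat)
    rng.shuffle(status)

    data = []
    for unit_id, t in enumerate(status):
        # Level 2: unit-specific intercept with unit-level error lambda_i.
        beta_0i = alpha0 + alpha1 * t + rng.gauss(0.0, sd_lambda)
        for _ in range(m):
            # Level 1: student outcome with student-level error theta_ij.
            data.append((unit_id, t, beta_0i + rng.gauss(0.0, sd_theta)))
    return data


def difference_in_means(data):
    """Estimate alpha1 as the treatment-control difference in mean outcomes."""
    treated = [w for _, t, w in data if t == 1]
    control = [w for _, t, w in data if t == 0]
    return sum(treated) / len(treated) - sum(control) / len(control)


data = simulate_two_level(n_units=2000, m=20, alpha0=0.0, alpha1=0.3,
                          sd_lambda=0.5, sd_theta=1.0)
estimate = difference_in_means(data)
print(f"estimated ATE: {estimate:.3f} (true value 0.3)")
```

Because the simulated outcome is built by substituting the unit-level intercept into the student-level equation, the same data could equally be described by the single composite regression with error λ_{i} + θ_{ij}.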

Suppose that *Score _{i}* and other unit- and student-level baseline
covariates are included in (19). In this case, the asymptotic variance
of the two-level (

where *R _{RA_X_B}^{2}* is the asymptotic regression

The within-school variance term in (20) is the conventional variance expression for an impact estimate under a nonclustered design. Design effects in a clustered design arise because of the first variance term, which represents the correlation of the outcomes of students within the same units (Murray 1998; Donner and Klar 2000; Raudenbush 1997). Design effects can be large because the divisor in the between-unit term is the number of units rather than the number of students.

It is common to express the variance expression in (20) in terms of the *intraclass
correlation (ICC)* (Cochran 1963;
Kish 1965), which is defined as the between-unit variance (σ_{λ}^{2})
as a proportion of the total variance of the outcome measure (σ^{2}
= σ_{λ}^{2} + σ_{θ}^{2}):
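Restated in symbols, the verbal definition in the preceding sentence implies the relations below. This display is an illustrative restatement of that definition and is not necessarily one of the numbered equations of this chapter.

```latex
ICC \;=\; \frac{\sigma_{\lambda}^{2}}{\sigma_{\lambda}^{2} + \sigma_{\theta}^{2}}
    \;=\; \frac{\sigma_{\lambda}^{2}}{\sigma^{2}},
\qquad
\sigma_{\lambda}^{2} = ICC \cdot \sigma^{2},
\qquad
\sigma_{\theta}^{2} = (1 - ICC)\,\sigma^{2}.
```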

In this formulation, design effects from clustering are small if the mean of the
outcome measure does not vary much across units (that is, if *ICC* is small).
In this case, the approach discussed above where student-level data are averaged
to the unit level will provide consistent, but inefficient impact estimates. On
the other hand, if the *ICC* is large (that is, close to 1), then using unit
averages or individual student-level data will produce impact estimates with similar
levels of precision. Specific *ICC* (and *R*^{2}) values will
depend on the design.
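The role of the *ICC* can be made concrete with a short calculation. The sketch below uses the standard textbook variance of a treatment-control difference in means when whole units are assigned, with no covariates (so all *R*^{2} terms are ignored). It is not claimed to reproduce (20) exactly, and the function names, the treatment share `p`, and the numerical inputs are hypothetical.

```python
def icc(var_between, var_within):
    """Between-unit variance as a share of total outcome variance."""
    return var_between / (var_between + var_within)


def ra_variance_two_level(var_between, var_within, n_units, m, p=0.5):
    """Textbook variance of a difference in means under unit-level assignment.

    Assumptions: n_units units, m students per unit, a share p of units
    assigned to treatment, and no covariate adjustment.
    """
    # Between-unit term: the divisor depends on the number of units only.
    between_term = var_between / (n_units * p * (1 - p))
    # Within-unit term: the divisor depends on the total number of students.
    within_term = var_within / (n_units * m * p * (1 - p))
    return between_term + within_term


def clustering_design_effect(icc_value, m):
    """Variance inflation relative to a nonclustered design of the same size."""
    return 1 + (m - 1) * icc_value


# Hypothetical inputs: 40 units, 60 students each, ICC of 0.15.
var_b, var_w, n_units, m = 0.15, 0.85, 40, 60
clustered = ra_variance_two_level(var_b, var_w, n_units, m)
nonclustered = (var_b + var_w) / (n_units * m * 0.5 * 0.5)
print(clustered / nonclustered, clustering_design_effect(icc(var_b, var_w), m))
```

Under these assumptions the two printed quantities agree, which shows that the inflation from clustering is driven by the between-unit term and grows with both the *ICC* and the number of students per unit.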

**The RD Design**

Results from Chapter 4 for the RD design can also be extended to the two-level model.
Let the observed outcome for a student, *w _{ij}^{RD}*, be
expressed as follows:

where the term inside the brackets is a mean zero residual term. If *Y _{Ti}*
and *Y*

where τ_{i} is a mean zero unit-level error term with variance
σ_{τ}^{2}, and δ_{ij} is a mean
zero student-level error term with variance σ_{δ}^{2} that
is uncorrelated with τ_{i}. The parameter α_{1}
is the same *ATE _{K}* parameter as in (8) above.

As with the RA design, (23) can be obtained using a two-level HLM:

Inserting the Level 2 equation into the Level 1 equation yields (23).

In Appendix B, it is proved that the two-level OLS estimator
α̂_{1} in (23) yields a consistent estimator of the *ATE _{K}*
parameter assuming that the model is specified correctly. This result holds even
if

where *R _{RD_X_B}^{2}* and

**The RD Design Effect**

A key finding is that the RD design effect *remains* at 1/(1 - ρ_{TS}^{2})
under the two-level design. This is because the variances inside the brackets in
(24) for the RD design equal the variances inside the brackets in (20) or (21) for
the RA design. Thus, relative to the aggregated model presented above, the use of
the two-level model for Designs II and III will typically improve the precision
of the impact estimates for both the RD and RA designs. However, the proportional
improvement in precision is the *same* for each design, so that the RD design
effect does not change. Similar results about the RD design effect apply also for
the fuzzy RD design and for calculating MDEs.
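The size of the design effect 1/(1 - ρ_{TS}^{2}) can be illustrated numerically. The sketch below assumes that ρ_{TS} denotes the correlation between treatment status and the assignment score, as the subscripts suggest, and that units with scores at or above the cutoff are treated. The evenly spaced (uniform) score grid, the cutoff, and the function names are hypothetical choices made only for illustration.

```python
import math


def pearson_correlation(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def rd_design_effect(scores, cutoff):
    """Compute 1 / (1 - rho_TS^2) for a sharp cutoff rule.

    Assumption: treatment is assigned to scores at or above the cutoff.
    """
    treatment = [1.0 if s >= cutoff else 0.0 for s in scores]
    rho_ts = pearson_correlation(treatment, scores)
    return 1.0 / (1.0 - rho_ts ** 2)


# Evenly spaced scores on (0, 1) with the cutoff at the midpoint.
n = 1000
uniform_scores = [(k + 0.5) / n for k in range(n)]
effect = rd_design_effect(uniform_scores, cutoff=0.5)
print(f"RD design effect: {effect:.3f}")  # roughly 4 for this grid
```

For this score grid the squared correlation is close to 0.75, so the RD variance is about four times the RA variance for the same sample size. Other score distributions and cutoff locations give different values.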

Theoretical results from the models discussed above carry over directly to Design
VI, where schools are the unit of assignment and classroom effects are treated as
random. Ignoring *R*^{2} terms for simplicity, the variance expression
for the RD impact estimator for this design is:

where *ICC _{1}* is the intraclass correlation at the school level,

The situation is somewhat different for Design IV, where students are the unit of
assignment and site effects are treated as random. For this design, it
is assumed that treatment effects are constant within sites, but not across sites.
Instead, site-level treatment effects are assumed to be drawn from a population
distribution with variance σ_{I}^{2}.

In this case, the design effect 1/(1 - ρ_{TS}^{2})
affects the student-level variance term, but not the site-level variance term. Ignoring
*R*^{2} terms, the variance expression for the RD impact estimate under Design IV is:

Thus, the RD design effect is smaller for Design IV than for the other designs considered above.
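The reasoning behind this conclusion can be sketched with a short calculation. The code below assumes only what is stated above: the variance has a site-level component that is unaffected by the factor 1/(1 - ρ_{TS}^{2}) and a student-level component that is multiplied by it. The argument names `site_term` and `student_term` and the numerical values are hypothetical placeholders, not terms taken from the variance expression itself.

```python
import math


def overall_rd_design_effect(site_term, student_term, rho_ts):
    """Ratio of RD to RA variance when only the student-level term is inflated.

    site_term and student_term are placeholder variance components that are
    assumed to be the same under the RA and RD designs.
    """
    student_level_factor = 1.0 / (1.0 - rho_ts ** 2)
    rd_variance = site_term + student_level_factor * student_term
    ra_variance = site_term + student_term
    return rd_variance / ra_variance


# Hypothetical components; rho_TS^2 = 0.75 gives a student-level factor of 4.
ratio = overall_rd_design_effect(site_term=0.002, student_term=0.004,
                                 rho_ts=math.sqrt(0.75))
print(f"overall RD design effect: {ratio:.2f}")
```

Because the site-level component enters both variances unchanged, the overall ratio is a weighted average of 1 and 1/(1 - ρ_{TS}^{2}), and so it lies below the full factor whenever the site-level component is positive.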

A similar situation occurs under Design V, where the variance expression for the RD impact estimate is: