The impact sample is a direct result of the lotteries and the critical component of the legislatively required rigorous evaluation of the OSP. Impact evaluations compare the outcomes for a group of study participants, some of whom were randomly awarded access to the intervention (e.g., an OSP scholarship), and some of whom were randomly assigned to not receive access. The lotteries conducted for the OSP cohorts in years 1 and 2 satisfy these requirements. Since the intervention under consideration is an Opportunity Scholarship to attend a private school, the impact analysis should focus on the population of applicants for whom private schooling represented a new opportunity. Thus, the impact sample for this evaluation will comprise all eligible applicants who were previously attending public schools (or were rising kindergartners) AND were subject to a lottery to determine whether they would receive an Opportunity Scholarship (figure 3-2, shaded).15
Overall
The impact sample group has the following characteristics:
With more than 2,300 students, the impact sample is large relative to the impact samples used in other evaluations of private school voucher programs (803 to 1,960 students).17 Statistical computations based on reasonable assumptions about study response rates indicate that the impact sample will be sufficient to detect Program impacts of at least a moderate and educationally meaningful size.18
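The conversion in footnote 18, from an effect size in standard deviations to NCE points and then to percentile ranks, can be reproduced with a short calculation. The sketch below is illustrative only and is not part of the evaluation's own analysis. It assumes the standard definition of the NCE scale (mean of 50, standard deviation of 21.06, normally distributed scores). The function and variable names are hypothetical.

```python
from statistics import NormalDist

NCE_SD = 21.06              # NCE points per standard deviation (NCE scale definition)
STD_NORMAL = NormalDist()   # standard normal distribution: mean 0, sd 1


def percentile_to_nce(percentile):
    """Convert a percentile rank (0-100, exclusive) to an NCE score."""
    return 50.0 + NCE_SD * STD_NORMAL.inv_cdf(percentile / 100.0)


def nce_to_percentile(nce):
    """Convert an NCE score back to a percentile rank."""
    return 100.0 * STD_NORMAL.cdf((nce - 50.0) / NCE_SD)


effect_sd = 0.15                 # minimum detectable effect cited in footnote 18
gain_nce = effect_sd * NCE_SD    # about 3.16 NCE points (reported as 3.15 in the footnote)

start_percentile = 20.0          # the footnote's example starting point
end_percentile = nce_to_percentile(percentile_to_nce(start_percentile) + gain_nce)

print(f"Gain in NCE points: {gain_nce:.2f}")
print(f"Percentile after the gain: {end_percentile:.1f}")
```

Because percentile ranks are not evenly spaced along the score distribution, changing `start_percentile` changes the percentile gain even though the NCE gain stays the same. This is the point the footnote makes about the conversion depending on where the change occurs.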
Treatment vs. Control Group Differences
An important strength of experimental methods of analysis is that the assignment of study participants to the treatment and control groups creates two analytic groups that are statistically similar. The treatment, in this case the offer of an Opportunity Scholarship, is provided to one group, and any subsequent differences in outcomes observed between the groups can be ascribed to the impact of that treatment. To ensure those conditions for the impact analysis, researchers must compare the characteristics of the treatment and control groups prior to the Program (at baseline) to see if the random assignment worked. Chance alone will generate a statistically significant baseline difference between the treatment and control groups in about 1 comparison out of 20, but a properly executed lottery should produce analytic groups that are similar in almost all respects.
Analysis of the OSP groups at baseline suggests that a strong foundation for the impact analysis has been laid, although some statistical procedures will be used to further equate the groups. The lotteries were conducted within grade bands, so any comparisons between the treatment and control groups need to be made within such groupings. Those comparisons for cohort 1 determined that the first-year lotteries worked as designed in producing statistically similar analytic groups.19 However, for cohort 2, the treatment and control groups within grade bands differ from each other to a statistically significant extent in 2 of 15 comparisons (table 3-5). For students entering grades K–5, the average family income of members of the control group is $1,287 higher than that of the treatment group, and the average years of mother's education is also slightly higher among elementary students randomly assigned to the control group. The treatment and control groups within the K–5 grade band are statistically similar regarding race, ethnicity, and gender, and the analytic groups within the junior high and high school grade bands are indistinguishable on all factors measured.
The presence of 2 statistically significant differences out of 15 comparisons between the treatment and control groups is most likely due to random chance. In year 2, multiple small-scale lotteries were conducted for each grade band, and the odds of getting differences between groups with random assignment increase when the samples are small.20 In any case, because we have nearly complete baseline measures for student background factors, we can control for any measurable post-lottery differences between the analytic groups in the course of estimating subsequent Program impacts.
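To give a rough sense of how often chance alone could produce 2 or more significant differences in 15 comparisons, the following sketch computes a simple binomial probability. It is illustrative only and is not the evaluation's own calculation. It assumes that each comparison is tested at the 5 percent significance level (consistent with the "1 out of 20" figure above) and that the 15 comparisons are independent of one another, which baseline characteristics such as income and mother's education may not be. The function name is hypothetical.

```python
from math import comb


def prob_at_least(k, n, alpha):
    """Probability that at least k of n independent tests are significant
    purely by chance, when each test has false-positive rate alpha."""
    prob_fewer_than_k = sum(
        comb(n, j) * alpha**j * (1 - alpha) ** (n - j) for j in range(k)
    )
    return 1.0 - prob_fewer_than_k


# Assumed inputs: 15 baseline comparisons, 5 percent significance level.
p_two_or_more = prob_at_least(2, 15, 0.05)
print(f"P(2 or more significant by chance): {p_two_or_more:.3f}")
```

Under these assumptions the probability is roughly 17 percent, so observing 2 significant differences in 15 comparisons would not be unusual for a properly executed lottery.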
15 The subgroups of eligible applicants to the Program who do not fit the criteria for the impact sample include eligible applicants in cohorts 1 and 2 who were already attending private schools (n=888) and two groups of public school applicants in cohort 1 who were all automatically awarded scholarships (n=851), specifically those from SINI public schools because of their high service priority and those applying for grades K–5 because there were sufficient private school slots in those grades to accommodate all of those applicants that year.
16 A total of five members (2.5 percent) of the cohort 1 randomized control group were awarded scholarships by lottery in the summer of 2005 as part of the control group follow-up lottery, which rewards control group members who cooperate with the evaluation testing requirements. Additional details regarding the follow-up control group lotteries are provided in Appendix A.
17 William G. Howell and Paul E. Peterson, with Patrick J. Wolf and David E. Campbell, The Education Gap: Vouchers and Urban Schools (Washington: Brookings, 2002), 44.
18 According to our power analysis in the first year report, an initial impact sample of 2,201 students should be sufficient to detect even rather modest test score impacts of 0.15 standard deviations both after 1 year and after 3 years of Program operation. See Patrick Wolf, Babette Gutmann, Nada Eissa, Michael Puma, and Marsha Silverberg, Evaluation of the DC Opportunity Scholarship Program: First Year Report on Participation (Washington, DC: U.S. Government Printing Office, 2005), A-4. To place this estimated effect size in context, an effect of 0.15 equates to a Normal Curve Equivalent (NCE) difference of 3.15 NCE points, since one standard deviation on the SAT-9 is 21.06 NCEs. Converting NCEs to a change in percentile ranks depends on where in the overall distribution the observed change occurs. For example, if the control group was, on average, at the 20th percentile, a gain of 3.15 NCEs would bring it up to about the 24th percentile. Such a gain is likely to be considered modest but educationally meaningful, and the capability to detect even a modest educational change is a clear strength of the impact evaluation going forward.
19 Wolf et al., Evaluation of the DC Opportunity Scholarship Program, 41-43.
20 To enable more parents to begin their school search early, within each grade band there was an early lottery for students deemed eligible by mid-spring and a late lottery for students not confirmed eligible until early summer. Students entering the same grade and with the same priority characteristics were assigned the exact same probability of winning a scholarship regardless of which scholarship lottery they entered.