Evaluation of the DC Opportunity Scholarship Program: Impact of the Program After 3 Years: Key Outcomes

Evaluation of the DC Opportunity Scholarship Program: Impacts After Three Years

NCEE 2009-4050
March 2009

Executive Summary

Impact of the Program After 3 Years: Key Outcomes

The statute that authorized the OSP mandated that the Program be evaluated with regard to its impact on student test scores and school safety, as well as the "success" of the Program, which, in the design of this study, includes satisfaction with school choices. The impacts of the Program on these outcomes are presented in two ways: (1) the impact of the offer of an OSP scholarship, derived directly from comparing outcomes of the treatment and control groups, and (2) the impact of using an OSP scholarship, calculated from the same unbiased treatment-control group comparison but statistically adjusted for students who declined to use their scholarships.⁶ The main focus of this study was on the overall group of students, with a secondary interest in students who applied from SINI schools, followed by other subgroups of students (e.g., defined by their academic performance at application, their gender, or their grade level).

Previous reports released in spring 2007 and spring 2008 indicated that 1 and 2 years after application, there were no statistically significant impacts on overall academic achievement or on student perceptions of school safety or satisfaction (Wolf et al. 2007; Wolf et al. 2008). Parents were more satisfied if their child was in the Program and viewed their child's school as safer and more orderly. Among the secondary analyses of subgroups, there were impacts on math test scores in year 1 for students who applied from non-SINI schools and those with relatively higher pre-Program test scores, and impacts in reading test scores (but not math) in year 2 for those same two subgroups plus students who applied in the first year of Program implementation. However, these findings were no longer statistically significant when subjected to a reliability test to adjust for the multiple comparisons of treatment and control group students across 10 subgroups; the results may be "false discoveries" and should therefore be interpreted and used with caution. Throughout this report, the phrases "appears to have an impact" and "may have had an impact" are used to caution readers regarding statistically significant impacts that may have been false discoveries.
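This summary does not name the adjustment used for multiple comparisons. As a minimal sketch of how such a check can work, the example below assumes a Benjamini-Hochberg false discovery rate procedure (an assumption, not something this text confirms). The function name and the p-values are hypothetical and are not taken from the evaluation.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans: True where a finding survives the
    Benjamini-Hochberg step-up procedure at false discovery rate alpha."""
    m = len(p_values)
    # Indices of the p-values, ordered from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Find the largest rank k whose p-value is at or below (k / m) * alpha.
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_rank = rank

    # Every finding ranked at or below max_rank is kept as significant.
    survives = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_rank:
            survives[idx] = True
    return survives


# Hypothetical p-values for 10 subgroup comparisons (illustration only).
# Two are below .05 on their own, but neither clears its adjusted threshold.
hypothetical_p = [0.03, 0.04, 0.20, 0.35, 0.45, 0.50, 0.60, 0.70, 0.80, 0.90]
print(benjamini_hochberg(hypothetical_p))
```

Under these made-up inputs, results that look significant one at a time no longer pass once all 10 comparisons are considered together, which is the situation the report flags with phrases such as "may have had an impact."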

The analyses in this report were conducted using data collected on students 3 years after they applied to the OSP.⁷

Impacts on Students and Parents Overall

Across the full sample, there was a statistically significant impact on reading achievement of 4.5 scale score points (effect size (ES) = .13)⁸ from the offer of a scholarship and 5.3 scale score points (ES = .15) from the use of a scholarship (table 3). These impacts are equivalent to 3.1 and 3.7 months of additional learning, respectively.⁹
There was no statistically significant impact on math achievement overall (ES = .03), either from the offer of a scholarship or from the use of a scholarship (table 3).¹⁰
Parents of students offered a scholarship were more likely to report their child's school to be safer and have a more orderly school climate (ES = .29) compared to parents of students not offered a scholarship (figure 3); the same was true for parents of students who chose to use their scholarships (ES = .34).
On the other hand, students who were offered a scholarship reported similar levels of school safety and an orderly climate compared to those in the control group (ES = .06; figure 3); there was also no significant impact on student reports of school safety and an orderly climate from using a scholarship (ES = .07).
The Program produced a positive impact on parent satisfaction with their child's school as measured by the likelihood of grading the school an "A" or "B," both for the impact of a scholarship offer (ES = .22; figure 4) and the impact of scholarship use (ES = .26).

Impacts on Subgroups

In addition to determining the general impacts of the OSP on all study participants, this evaluation also reports programmatic impacts on policy-relevant subgroups of students. The subgroups were designated prior to data collection and include students who were attending SINI versus non-SINI schools at application, those relatively higher or lower performing at baseline, girls or boys, elementary versus high school students, and those from application cohort 1 or cohort 2. Since the subgroup analysis involves significance tests across multiple comparisons of treatment and control students, some of which may be statistically significant merely by chance, these subgroup-specific results should be interpreted with caution. Specifically:

Subgroup Achievement Impacts

There were no statistically significant reading (ES = .05) or math (ES = .01) achievement impacts for the high-priority subgroup of students who had attended a SINI public school under No Child Left Behind (NCLB) before applying to the Program.
There were statistically significant impacts on reading test scores in year 3 for five subgroups of students, although the statistical significance of two of the subgroup findings was not robust to adjustments for multiple comparisons:

Students who attended non-SINI public schools prior to application to the Program (56 percent of the impact sample) scored an average of 6.6 scale score points higher in reading (ES = .19) if they were offered the scholarship compared to not being offered a scholarship and 7.7 scale score points higher (ES = .22) if they used their scholarship compared to not being offered a scholarship. These scale score differences between the treatment and control groups translate into 4.1 and 4.9 additional months of learning, or half a year of schooling based on a typical 9-month school year.
Students who entered the Program in the higher two-thirds of the test-score performance distribution at baseline (66 percent of the impact sample) scored an average of 5.5 scale score points higher in reading (ES = .17) if they were offered a scholarship and 6.2 scale score points higher (ES = .19) if they used their scholarship, impacts equivalent to 4.0 and 4.6 months of learning gains.
Female students scored an average of 5.1 scale score points higher in reading (ES = .15) if they were offered a scholarship and 5.8 scale score points higher (ES = .17) if they used their scholarship. These impacts represent 3.1 and 3.6 months of additional learning, respectively. The statistical significance of this finding was not robust to adjustments for multiple comparisons.
Students who entered the Program in grades K-8 (81 percent of the impact sample) scored an average of 5.2 scale score points higher in reading (ES = .15) or 2.9 months of additional learning if they were offered a scholarship compared to not being offered a scholarship and 6.0 scale score points higher (ES = .17) or 3.3 months of additional learning if they used their scholarship compared to not being offered a scholarship.
Students from the first cohort of applicants (21 percent of the impact sample) scored an average of 8.7 scale score points higher in reading (ES = .31) if they were offered a scholarship compared to not being offered a scholarship and 11.7 scale score points higher (ES = .42) if they used their scholarship compared to not being offered a scholarship. These impacts translate into 14.1 and 18.9 months of additional learning (1.5 to 2 years of typical schooling). The statistical significance of this finding was not robust to adjustments for multiple comparisons.

The OSP had no statistically significant reading impacts for other subgroups of participating students, including those in the lower third of the test-score performance distribution at baseline, boys, secondary students, and students from the second cohort of applicants (ES ranging from -.00 to .11).
The OSP had no statistically significant math impacts for any of the 10 subgroups (ES ranging from -.16 to .23).

Subgroup Safety and Satisfaction Impacts

Parents in all 10 subgroups analyzed, including parents of the high-priority subgroup of students who had attended SINI schools at baseline, reported viewing their child's school as safer and more orderly if the child was offered or used an OSP scholarship compared to not being offered a scholarship. Effect sizes for the impact of an offer of a scholarship on parent perceptions of safety and an orderly school climate for the 10 subgroups ranged from .27 to .40. Adjustments for multiple comparisons indicate that these 10 subgroup impacts on parental perceptions of safety and school climate are not likely to be false discoveries.
Consistent with the finding for students overall, none of the subgroups of students reported experiencing differences in safety and an orderly school climate, whether they were offered (ES ranging from -.03 to .08) or used an OSP scholarship.
In addition to an overall impact on parental satisfaction with their child's school, the Program produced satisfaction impacts on 7 of the 10 subgroups analyzed. Effect sizes for the impact of an offer of a scholarship on the likelihood of a parent grading his/her child's school "A" or "B" for these seven subgroups ranged from .16 to .41. Adjustments for multiple comparisons indicated that none of these parent satisfaction subgroup impacts is likely to have been a false discovery. The parents of students who had attended SINI schools, parents of students in the lower one-third of the test score distribution, and parents of high school students generally did not report statistically significantly higher levels of school satisfaction as a result of the treatment (ES ranged from -.03 to .13).
There were no statistically significant differences between the treatment group and the control group for any of the 10 subgroups in the likelihood that students gave their school a grade of A or B (ES ranged from -.18 to .05).


⁶ This analysis uses straightforward statistical adjustments to account not only for the approximately 14 percent of impact sample year 3 respondents who received the offer of a scholarship but declined to use it over the 3-year period after application (the "never users"), but also the estimated 1.6 percent of the control group who never received a scholarship offer but who, by virtue of having a sibling with an OSP scholarship, ended up in a participating private school (we call this "program-enabled crossover"). These adjustments increase the size of the scholarship offer effect estimates, but do not alter the statistical significance of the impact estimate.
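The adjustment in footnote 6 can be sketched as follows, assuming it takes the form of a Bloom-style rescaling in which the offer impact is divided by the net share of students whose private school attendance was changed by the offer. The footnote does not spell out the formula, so the function below is an illustrative assumption; the input values are the rounded figures reported in this summary.

```python
def impact_of_use(offer_impact, never_user_rate, crossover_rate):
    """Rescale an impact of the scholarship offer into an impact of
    scholarship use (assumed Bloom-style adjustment).

    never_user_rate: share of the treatment group that never used the offer.
    crossover_rate:  share of the control group that ended up in a
                     participating private school through the Program.
    """
    # Net difference in use between the treatment and control groups.
    net_take_up = (1.0 - never_user_rate) - crossover_rate
    if net_take_up <= 0:
        raise ValueError("net take-up must be positive")
    return offer_impact / net_take_up


# Rounded figures from this summary: a 4.5-point reading impact of the offer,
# about 14 percent never users, and 1.6 percent program-enabled crossover.
reading_use_impact = impact_of_use(4.5, 0.14, 0.016)
print(round(reading_use_impact, 1))  # 5.3
```

With these rounded inputs the sketch gives about 5.3 scale score points, in line with the impact of scholarship use reported for reading. Because the divisor is below one, the adjustment enlarges the estimate without changing its sign.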
⁷ Specifically, year 3 test scores were obtained from 69 percent of study participants, whereas parent survey data were gathered from 68 percent of participants and student survey data from 67 percent of participants. Response rates to the principal survey varied between 51.8 percent and 57.3 percent, depending on academic year and school sector. Missing outcome data create the potential for nonresponse bias in a longitudinal evaluation such as this one, if the nonrespondent portions of the sample are different between the treatment and control groups. Response rates differed by less than 2 percent between the treatment and control groups for the tests and parent and student surveys, meaning that similar proportions of the treatment and control groups provided outcome data. In addition, nonresponse weights were used to equate the two groups on important baseline characteristics, thereby reducing the threat of nonresponse bias in this case.
⁸ An effect size (ES) is a standardized measure of the relative size of a program impact. In this report, effect sizes are expressed as a proportion of a standard deviation of the distribution of values observed for the study control group. For a normally distributed variable such as outcome test scores, one full standard deviation above and below the average value contains about 68 percent of the observations in the distribution. Two full standard deviations above and below the average contain about 95 percent of the observations.
⁹ Scale score impacts were converted to approximate months of learning first by dividing the impact ES by the ES of the weighted (by grade) average annual increase in reading scale scores for the control group. The result was the proportion of a typical year of achievement gain represented by the programmatic impact. That number was then multiplied by nine to convert the magnitude of the gain to months, since the official school year in the District of Columbia comprises 9 months of instruction.
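As a minimal sketch of the definition in footnote 8, the example below divides an impact by a control group standard deviation and checks how much of a normal distribution lies within one and two standard deviations of the mean. The control group standard deviation of 34.6 points is a hypothetical value chosen for illustration; this summary does not report it.

```python
import math


def effect_size(impact, control_sd):
    """Express an impact as a proportion of the control group's
    standard deviation."""
    return impact / control_sd


def normal_coverage(k):
    """Share of a normal distribution lying within k standard
    deviations above and below the mean."""
    return math.erf(k / math.sqrt(2))


# Hypothetical control group SD (not reported in this summary).
HYPOTHETICAL_CONTROL_SD = 34.6
print(round(effect_size(4.5, HYPOTHETICAL_CONTROL_SD), 2))  # 0.13

print(round(normal_coverage(1), 2))  # 0.68
print(round(normal_coverage(2), 2))  # 0.95
```

The coverage figures hold only for a normal distribution; actual test score distributions may depart from them.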
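The conversion to months of learning can be sketched as below: the impact effect size is expressed as a share of a typical year's gain, and that share is scaled by the 9 months in the school year. The annual gain effect size of 0.39 is a hypothetical value used only for illustration; the summary does not report the control group's weighted average annual gain.

```python
def months_of_learning(impact_es, annual_gain_es, months_per_year=9):
    """Convert an impact effect size into approximate months of learning.

    impact_es:      effect size of the program impact.
    annual_gain_es: effect size of a typical year's achievement gain
                    for the control group (hypothetical input here).
    """
    # Proportion of a typical year's gain represented by the impact.
    share_of_year = impact_es / annual_gain_es
    # Scale that proportion by the number of months of instruction.
    return share_of_year * months_per_year


# Hypothetical annual gain of 0.39 standard deviations (illustration only).
print(round(months_of_learning(0.13, 0.39), 1))  # 3.0
```

Because the annual gain is weighted by grade, subgroups with different grade mixes can show different months of learning for similar scale score impacts.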
¹⁰ The magnitudes of these estimated achievement effects are below the threshold of .12 standard deviations, estimated by the power analysis to be the study's Minimum Detectable Effect (MDE) size.