National Assessment of Title I - Interim Report to Congress

Executive Summary

NCEE 2006-4000
June 2006

F. Discussion of Impacts

This first-year report assesses the impact of the four interventions on the treatment groups in comparison with the control groups immediately after the end of the reading interventions. In particular, we provide detailed estimates of the impacts, including the impact of being randomly assigned to receive any of the interventions, being randomly assigned to receive a word-level intervention, and being randomly assigned to receive each of the individual interventions. For purposes of this summary, we focus on the impact of being randomly assigned to receive any intervention compared to receiving the instruction that would normally be provided. These findings are the most robust because of the larger sample sizes. The full report also estimates impacts for various subgroups, including students with weak and strong initial word attack skills, students with low or high beginning vocabulary scores, and students who either qualified or did not qualify for free or reduced-price school lunches.²

The impact of each of the four interventions is the difference between average treatment and control group outcomes. Because students were randomly assigned to the two groups, we would expect the groups to be statistically equivalent; thus, with a high probability, any differences in outcomes can be attributed to the interventions. Also because of random assignment, the outcomes themselves can be defined either as test scores at the end of the school year, or as the change in test scores between the beginning and end of the school year (the "gain"). In the tables of impacts (Tables 2-4), we show three types of numbers. The baseline score shows the average standard score for students at the beginning of the school year. The control gain indicates the improvement that students would have made in the absence of the interventions. Finally, the impact shows the value added by the interventions. In other words, the impact is the amount that the interventions increased students' test scores relative to the control group. The gain in the intervention group students' average test scores between the beginning and end of the school year can be calculated by adding the control group gain and the impact.
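The relationship among the three quantities reported in the impact tables can be illustrated with a short sketch. The function names and the numbers below are hypothetical and chosen only for illustration; they are not taken from Tables 2-4.

```python
def intervention_gain(control_gain: float, impact: float) -> float:
    """Gain for the intervention group: the control-group gain plus the impact."""
    return control_gain + impact


def end_of_year_score(baseline: float, gain: float) -> float:
    """Average end-of-year standard score: baseline score plus the gain."""
    return baseline + gain


# Hypothetical values, for illustration only (not results from the study).
baseline = 85.0      # average standard score at the beginning of the school year
control_gain = 1.5   # improvement expected in the absence of the interventions
impact = 4.0         # value added by the interventions

treatment_gain = intervention_gain(control_gain, impact)
control_end = end_of_year_score(baseline, control_gain)
treatment_end = end_of_year_score(baseline, treatment_gain)

# With equal baselines, the difference in end-of-year scores equals the
# difference in gains, which is why either definition of the outcome works.
print(treatment_gain, control_end, treatment_end, treatment_end - control_end)
```

The sketch assumes the two groups share the same baseline, which random assignment makes true in expectation rather than exactly in any one sample.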

In practice, impacts were estimated using a hierarchical linear model that included a student-level model and a school-level model. The student-level model included indicators for treatment status and grade level as well as the baseline test score. The baseline test score was included to increase the precision with which we measured the impact, that is, to reduce the standard error of the estimated impact. The school-level model included indicators for the intervention to which each school was randomly assigned and indicators for the blocking strata used in the random assignment of schools to interventions. Below, we describe some of the key interim findings:

For third graders, we found that the four interventions combined had impacts on phonemic decoding, word reading accuracy and fluency, and reading comprehension. There are fewer significant impacts for fifth graders than for third graders (see Table 2). The impacts of the three word-level interventions combined were similar to those for all four interventions combined. Many of the impacts shown in Table 2 for third graders are positive and statistically significant, whether all four interventions or just the three word-level interventions are considered. It is noteworthy, however, that on the GRADE, which is a group-administered test of reading comprehension, the impact estimate and the estimated change in standard scores for the control group indicate that the intervention groups did not improve substantially in reading comprehension relative to the larger normative sample for the test. Instead, this evidence suggests that the interventions helped these students maintain their relative position among all students and not lose ground in reading comprehension, as measured by the GRADE test. Results from the GRADE test are particularly important, because this test, more than others in the battery, closely mimics the kinds of testing demands (group administration, responding to multiple-choice comprehension questions) found in current state-administered reading accountability measures.

Among key subgroups, the most notable variability in findings was observed between students who qualified for free or reduced-price lunches and those who did not. Although the ability to compare impacts between groups is limited by the relatively small samples, we generally found significant impacts on the reading outcomes for third graders who did not qualify and few significant impacts for those who did qualify (see Tables 3 and 4), both when all four interventions are considered together and when the three word-level interventions are considered together. These findings for third graders may be driven in part by particularly large negative gains among the control group students in the schools assigned to one intervention.

At the end of the first year, the reading gap for students in the intervention group was generally smaller than the gap for students in the control group when considering all four interventions together. The reading gap describes the extent to which the average student in one of the two evaluation groups (intervention or control) is lagging behind the average student in the population (see Figures 1-12 and Table 5). The reduction in the reading gap attributable to the interventions at the end of the school year is measured by the interventions' impact relative to the gap for the control group, the latter showing how well students would have performed if they had not been in one of the interventions. Being in one of the interventions reduced the reading gap on Word Attack skills by about two-thirds for third graders. On other word-level tests and a measure of reading comprehension, the interventions reduced the gap for third graders by about one-fifth to one-quarter. For fifth graders, the interventions reduced the gap for Word Attack and Sight Word Efficiency by about 60 and 12 percent, respectively.³
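
The gap-reduction measure described above (the impact divided by the control group's gap) can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, the population average of 100 assumes a conventional standard-score scale, and the input values are invented for the example rather than drawn from Table 5.

```python
def control_gap(population_mean: float, control_mean: float) -> float:
    """How far the average control-group student lags the population average."""
    return population_mean - control_mean


def gap_reduction(impact: float, population_mean: float, control_mean: float) -> float:
    """Share of the control-group gap closed by the interventions (impact / gap)."""
    gap = control_gap(population_mean, control_mean)
    if gap <= 0:
        raise ValueError("control group does not lag the population average")
    return impact / gap


# Hypothetical values, for illustration only (not results from the study).
population_mean = 100.0  # assumed mean of the standard-score scale
control_mean = 91.0      # control-group average at the end of the school year
impact = 6.0             # estimated impact in standard-score points

share = gap_reduction(impact, population_mean, control_mean)
print(f"Gap closed: {share:.0%}")
```

Because the measure is a ratio, the same impact closes a larger share of the gap when the control group sits closer to the population average, so gap reductions are best read alongside the underlying impacts.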

Future reports will focus on the impacts of the interventions one year after they ended. At this point, it is still too early to draw definitive conclusions about the impact of the interventions assessed in this study. Based on the results from earlier research (Torgesen et al. 2001), there is a reasonable possibility that students who substantially improved their phonemic decoding skills will continue to improve in reading comprehension relative to average readers. Consistent with the overall pattern of immediate impacts, we would expect more improvement in students who were third graders when they received the intervention relative to fifth graders. We are currently processing second-year data (which includes scores on the Pennsylvania state assessments) and expect to release a report on that analysis within the next year.


² The impacts described here represent the impact of being selected to participate in one of the interventions. A small number of students selected for the interventions did not participate, and about 7.5 percent received less than a full dose (80 hours) of instruction. Estimation of the effect of an intervention on participants and those who participated for 80 or more hours requires that stronger assumptions be made than when estimating impacts for those offered the opportunity to participate, and we cannot have the same confidence in the findings as we do with the results discussed in this summary. Our full report presents estimates of the effects for participants and those who participated for at least 80 hours. These findings are similar to those reported here.
³ In future analyses, we plan to explore another approach for estimating the impact of the interventions on closing the reading gap. This approach will contrast the percentage of students in the intervention groups and the control groups who scored within the "normal range" on the standardized tests.