Beginning Reading - Methodology
What Works Clearinghouse


Beginning Reading
July 16, 2007

Methodology

Eight hundred eighty-seven studies provided data on 153 programs and were classified according to the strength of their design. To be fully reviewed, a study had to use a randomized controlled trial or a quasi-experimental design. [1]

Eligibility for review

Quasi-experiments eligible for review include those that equate groups through matching or statistical adjustment, as well as regression discontinuity designs and single-case designs. No studies based on regression discontinuity designs were identified for the beginning reading review; several single-case designs were identified. The WWC is currently developing evidence standards for regression discontinuity designs and single-case designs.

The review considered the properties of the measurement instruments; the percentage of students, classrooms, or schools in the study sample that were not included in the reported results; and any sample characteristics or events that might serve as alternative explanations for the observed effect. For details, see the WWC Evidence Standards.

The research evidence for programs that have at least one study meeting WWC evidence standards with or without reservations is summarized in individual intervention reports posted on the WWC website (http://www.whatworks.ed.gov). So far, 51 studies of 24 beginning reading programs have met evidence standards with or without reservations. The lack of evidence for the remaining programs does not mean that those programs are ineffective. Some programs have not yet been studied using a design that permits the WWC to draw any conclusions about their effectiveness. For some studies, not enough data were reported (such as descriptive statistics of the findings) to enable the WWC to confirm the statistical findings.

Rating of effectiveness

Among the prioritized interventions, each beginning reading program that had at least one study meeting WWC standards with or without reservations received a rating of effectiveness for beginning reading achievement. The rating of effectiveness aims to characterize the existing evidence base in a given domain. The intervention effects based on the research evidence can be rated as having positive, potentially positive, mixed, no discernible, potentially negative, or negative effects.

The rating of effectiveness takes into account four factors: the quality of the research design; the statistical significance of the findings; the size of the difference between participants in the intervention and comparison conditions; and the consistency in findings across the studies (see the WWC Intervention Rating Scheme).

The level of statistical significance was reported by the study authors or, where necessary, calculated by the WWC to correct for clustering within classrooms or schools and for multiple comparisons. Because of these corrections, the level of statistical significance as calculated by the WWC may differ from the one originally reported by the study authors. For an explanation, see the WWC Tutorial on Mismatch. For the formulas that we used to calculate statistical significance, see Technical Details of WWC-Conducted Computations. If the average effect size across all outcome measures in one study in a single domain is at least 0.25, it is considered substantively important, contributing toward the rating of effectiveness. See the technical appendices of the beginning reading intervention reports for further details.
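The "substantively important" check described above can be illustrated with a minimal sketch. It assumes that the study-level figure is a simple unweighted mean of the effect sizes for the outcome measures in one domain; the function name and the sample values are hypothetical, and the exact WWC computation is documented in the Technical Details of WWC-Conducted Computations.

```python
from statistics import fmean

# Threshold stated in the text: an average effect size of at least 0.25.
SUBSTANTIVE_THRESHOLD = 0.25


def is_substantively_important(effect_sizes):
    """Return (average, flag) for one study's outcomes in a single domain.

    Assumption for illustration: the average is an unweighted mean of the
    effect sizes reported for each outcome measure.
    """
    if not effect_sizes:
        raise ValueError("at least one effect size is required")
    average = fmean(effect_sizes)
    return average, average >= SUBSTANTIVE_THRESHOLD


# Hypothetical effect sizes for three outcome measures in one study.
average, flag = is_substantively_important([0.30, 0.20, 0.31])
print(f"average effect size = {average:.2f}, substantively important = {flag}")
```

An average that clears the threshold contributes toward the rating of effectiveness even when the finding is not statistically significant.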

Extent of evidence

The WWC categorizes the extent of evidence in each domain as small or medium to large (see the What Works Clearinghouse Extent of Evidence Categorization Scheme). The extent of evidence takes into account the number of studies and the total sample size across the studies that met WWC evidence standards with or without reservations. [2]

Improvement Index

The WWC computes an improvement index for each individual finding. In addition, within each outcome domain, the WWC computes an average improvement index for each study, as well as a domain average improvement index across studies of the same intervention (see the Technical Details of WWC-Conducted Computations). The improvement index represents the difference between the percentile rank of the average student in the intervention condition and the percentile rank of the average student in the comparison condition. The improvement index can take on values between -50 and +50, with positive numbers denoting results favorable to the intervention group. Unlike the rating of effectiveness, the improvement index is based only on the size of the difference between the intervention and the comparison conditions.
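As an illustration of how a percentile-rank difference of this kind can be derived from an effect size, the sketch below assumes normally distributed outcomes and places the average comparison-group student at the 50th percentile. The function names are hypothetical, and the formula is a common conversion rather than one quoted from this page; the authoritative computation is in the Technical Details of WWC-Conducted Computations.

```python
import math


def normal_cdf(z):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def improvement_index(effect_size):
    """Percentile-rank difference implied by a standardized effect size.

    Assumptions for illustration: outcomes are normally distributed and
    the average comparison-group student sits at the 50th percentile, so
    the index is the intervention-group average student's percentile
    rank in the comparison distribution, minus 50.
    """
    return 100.0 * normal_cdf(effect_size) - 50.0


# An effect size of 0.25 corresponds to an index of roughly +9.9
# percentile points; an effect size of 0 gives an index of 0.
for g in (-0.25, 0.0, 0.25):
    print(f"effect size {g:+.2f} -> improvement index {improvement_index(g):+.1f}")
```

Because the normal distribution function is bounded by 0 and 1, the result always falls between -50 and +50, matching the range stated above.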

[1] Thirty-two interventions (involving 36 quasi-experimental design studies) passed the initial screening criteria but were not included in this wave of Beginning Reading reviews. On initial screening, each of these interventions had only one eligible study that met WWC evidence standards with reservations (i.e., they had the fewest studies, which also used less rigorous designs). Seven additional single-case studies have dispositions pending. The WWC is currently developing standards for the review of single-case studies.
[2] The Extent of Evidence Categorization was developed to tell readers how much evidence was used to determine the intervention rating, focusing on the number and size of studies. Additional factors associated with a related concept, external validity (such as the students' demographics and the types of settings in which studies took place), are not taken into account in the categorization.


PO Box 2393
Princeton, NJ 08543-2393
Phone: 1-866-503-6114