Appendix A1 Study characteristics: Study characteristics: Beck Evaluation & Testing Associates, 2005 (randomized controlled trial)
| Characteristic | Description |
|---|---|
| Study citation | Beck Evaluation & Testing Associates, Inc. (2005). Progress in Mathematics © 2006: Grade 1 pre-post field test evaluation study. New York: Sadlier-Oxford Division, William H. Sadlier, Inc. |
| Participants | The study included 186 first graders (96 students in the intervention group and 90 students in the comparison group) in eight classrooms across four schools. Within schools, classrooms were randomly assigned to the intervention or comparison group. For rating purposes, the sample for the analysis of the Terra Nova Mathematics Test included 186 students, and the sample for the analysis of the Terra Nova Math Computation Test included 181 students. |
| Setting | The eight classrooms were located in four elementary schools in four school districts in the eastern United States. Three of the schools were Catholic schools, and one was a public school. One pair of classrooms (one intervention and one comparison) was located in each of the participating schools. |
| Intervention | The intervention classrooms received the pre-publication version of Progress in Mathematics © 2006 student edition materials, student workbooks, and teacher guides. The study indicated that those materials resembled as closely as possible the intended published version. |
| Comparison | The comparison classrooms used the 2000 version of Progress in Mathematics. This textbook series had been used in the participating schools for at least three years prior to the study. This intervention report regards Progress in Mathematics © 2006 as a different math program from Progress in Mathematics © 2000. The WWC team compared the textbooks of both programs and found them to differ extensively in terms of content, assessment materials, organization, and presentation. Whereas the 2000 version emphasizes written computation skills, the © 2006 version focuses on mathematical language and problem solving in addition to computation. Information received from the developer confirmed this difference between programs. |
| Primary outcomes and measurement | Students were tested using the TerraNova Mathematics and Math Computation Tests (see Appendix A2 for more detailed descriptions of outcome measures).1 |
| Teacher training | Intervention group teachers received a pre-implementation orientation from the developer's editorial staff. They also received ongoing editorial department support through in-person visits and by phone throughout the study.2 The comparison group teachers already had previous training and experience with their current textbooks. |
|
1 The study reported on student outcomes using an additional outcome measure, the Custom Test, which did not meet WWC evidence screens because of differential attrition of students in the intervention and comparison groups. |
|
Appendix A2 Outcome measures in the math achievement domain
| Outcome measure | Description |
|---|---|
| TerraNova Mathematics Test | Level 11, Form C of the CAT TerraNova series, second edition (McGraw-Hill, 2001; as cited in Beck Evaluation and Testing Associates, 2005) is a standardized nationally normed test. This mathematics test includes 47 items and is part of the CAT Basic Battery. |
| TerraNova Mathematics Computation Test | Level 11, Form C of the CAT TerraNova series, second edition (McGraw-Hill, 2001; as cited in Beck Evaluation and Testing Associates, 2005) is a standardized nationally normed test. This test includes 20 mathematics computation items and is part of the CAT Plus portion of the Terra Nova series. |
Appendix A3 Summary of study findings included in the rating for the math achievement domain1
| Author's findings from the study | ||||||||
|---|---|---|---|---|---|---|---|---|
| Mean outcome (standard deviation2) | WWC calculations | |||||||
| Outcome measure | Study sample | Sample size (classrooms/students) | Progress in Mathematics group3 | Comparison group | Mean difference4 (Progress in Mathematics – comparison) | Effect size5 | Statistical significance6 (at α= 0.05) | Improvement index7 |
| Beck Evaluation & Testing Associates, 2005 (randomized controlled trial)8 | ||||||||
| TerraNova Mathematics Test | Grade 1 | 8/186 | 40.62 (4.30) | 37.70 (5.80) | 2.92 | 0.57 | ns | +22 |
| TerraNova Mathematics Computation Test | Grade 1 | 8/181 | 15.50 (2.70) | 16.80 (3.30) | -1.30 | -0.43 | ns | -17 |
| Domain average9 for math achievement (Beck Evaluation & Testing Associates, 2005) | 0.07 | ns | +3 | |||||
|
ns = not statistically significant 1 This appendix reports findings considered for the effectiveness rating and the average improvement indices. Subtest findings from the same study are not included in these ratings, but are reported in Appendix A4.2 The standard deviation across all students in each group shows how dispersed the participants' outcomes are: a smaller standard deviation on a given measure would indicate that participants had more similar outcomes. 3 The intervention group mean equals the comparison group mean plus the mean difference. 4 Positive differences and effect sizes favor the intervention group; negative differences and effect sizes favor the comparison group. The mean difference and effect size were calculated using the difference in difference approach, which takes baseline student scores into account. 5 For an explanation of the effect size calculation, see Technical Details of WWC-Conducted Computations. 6 Statistical significance is the probability that the difference between the groups is a result of chance rather than a real difference between the groups. 7 The improvement index represents the difference between the percentile rank of the average student in the intervention condition and that of the average student in the comparison condition. The improvement index can take on values between -50 and +50, with positive numbers denoting results favorable to the intervention group. 8 The level of statistical significance was reported by the study authors or, where necessary, calculated by the WWC to correct for clustering within classrooms or schools and for multiple comparisons. For an explanation about the clustering correction, see the WWC Tutorial on Mismatch. See Technical Details of WWC-Conducted Computations for the formulas the WWC used to calculate statistical significance. In the case of Beck Evaluations & Testing Associates (2005), corrections for clustering and multiple comparisons were needed, so the significance levels differ from those reported in the original study. 9 This row provides the study average, which in this instance is also the domain average. The WWC-computed domain average effect size is a simple average rounded to two decimal places. The domain improvement index is calculated from the average effect size. |
||||||||
Appendix A4 Progress in Mathematics © 2006 rating for the math achievement domain
The WWC rates an intervention's effects in a given outcome domain as positive, potentially positive, mixed, no discernible effects, potentially negative, or negative.1
For the outcome domain of math achievement, the WWC rated Progress in Mathematics © 2006 as having no discernible effects. It did not meet the criteria for other ratings (positive effects, potentially positive effects, mixed effects, potentially negative effects, and negative effects) because the single study that met WWC standards did not show statistically significant or substantively important effects.
| Rating received |
|---|
|
No discernible effects: No affirmative evidence of effects.
|
| Other ratings considered |
|
Positive effects: Strong evidence of a positive effect with no overriding contrary evidence.
|
|
Potentially Positive effects: Evidence of a positive effect with no overriding contrary evidence.
|
|
Mixed effects: Evidence of inconsistent effects as demonstrated through either of the following criteria.
|
|
Potentially negative effects: Evidence of a negative effect with no overriding contrary evidence.
|
|
Negative effects: Strong evidence of a negative effect with no overriding contrary evidence.
|
|
1 For rating purposes, the WWC considers the statistical significance of individual outcomes and the domain-level effect. The WWC also considers the size of the domain-level effect for ratings of potentially positive or potentially negative effects. See the WWC Intervention Rating Scheme for a complete description. |
Appendix A5 Extent of evidence by domain
| Sample size | ||||
|---|---|---|---|---|
| Outcome domain | Number of studies | Schools | Studies | Extent of evidence1 |
| Math achievement | 1 | 4 | 181 | Small |
|
na = not applicable/not studied 1 A rating of "medium to large" requires at least two studies and two schools across studies in one domain, and a total sample size across studies of at least 350 students or 14 classrooms. Otherwise, the rating is "small." |
||||
|Institute of Education Sciences