Achievement Effects of Four Early Elementary School Math Curricula

NCEE 2009-4052
February 2009

Main Findings

The study’s main findings include information about curriculum implementation and the relative effects of the curricula on student math achievement. Statistical tests were used to assess the significance of all the results. Hierarchical linear modeling (HLM) techniques—which account for the extent to which students are clustered in classrooms and schools according to achievement—were used to conduct the statistical tests. When comparing results for pairs of curricula, the Tukey-Kramer method (Tukey 1952, 1953; Kramer 1956) was used to adjust the statistical tests for the six unique pair-wise comparisons that can be made with four curricula, as described above. Only results that are statistically significant at the 5 percent level of confidence are discussed.5
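The multiplicity being adjusted for can be sketched in a few lines of Python. This is a minimal illustration of the comparison count only; it does not reproduce the study's HLM-based Tukey-Kramer tests:

```python
# Four curricula yield C(4, 2) = 6 unique pair-wise comparisons, which is
# why the Tukey-Kramer adjustment is applied to six tests at once.
from itertools import combinations
from math import comb

curricula = ["Investigations", "Math Expressions", "Saxon", "SFAW"]

pairs = list(combinations(curricula, 2))
assert len(pairs) == comb(4, 2) == 6

for a, b in pairs:
    print(f"{a} vs. {b}")
```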

Before presenting the main findings, it is worth noting what information the study does and does not provide. The relative effects of the curricula presented below reflect all differences between the curricula as implemented, including differences in teacher training, instructional strategies, content coverage, and curriculum materials. The relative effects therefore depend on how teachers implemented the curricula, and implementation reflects what publishers and teachers achieved, not a level of implementation specified by the study. Information about curriculum implementation presented in this report is based only on teacher reports; the study team is observing classrooms and plans to present that information in a future report.6 Also, the relative effects of the curricula are based only on the ECLS-K math assessment administered by the study team. In the third grade, and perhaps even the second grade, districts administer their own math assessments, and the study team is investigating the possibility of obtaining those scores for future analyses of second and third graders. Lastly, because the participating sites are not a representative sample of districts and schools, the design does not support statements about effects for districts and schools outside of the study.

Curriculum Implementation. The main findings from the implementation analysis are:

  • All teachers received initial training from the publishers, and 96 percent received follow-up training. Combined initial and follow-up training varied by curriculum, ranging from 1.4 to 3.9 days.
  • Nearly all teachers reported on the fall and spring surveys that they were using their assigned curriculum as their core math curriculum (99 percent in the fall and 98 percent in the spring), and about a third reported supplementing it with other materials (34 percent in the fall and 36 percent in the spring).
  • Eighty-eight percent of teachers reported completing at least 80 percent of their assigned curriculum.7
  • On average, Saxon teachers reported spending one more hour on math instruction per week than did teachers of the other curricula.

Achievement Effects. The figure below illustrates the relative effects of the study’s curricula on student math achievement. The figure includes a symbol for each of the four curricula, where the dot in the middle of each symbol indicates the average spring math score of students in the respective curriculum groups. The average scores are adjusted for baseline measures of several student, teacher, and school characteristics related to student spring achievement (such as student fall math scores) to improve the precision of the results. The bars that extend from each dot represent the 95 percent confidence interval around each average score. HLM techniques were used to calculate the average scores and confidence intervals.
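As a rough illustration of how each bar in the figure is formed, a 95 percent confidence interval around an adjusted average spans the average plus or minus 1.96 standard errors. The numbers below are hypothetical; the report's intervals are derived from HLM standard errors, not this simple formula:

```python
# Illustrative 95 percent confidence interval around a group average
# (hypothetical adjusted average, in standard deviation units, and SE).
mean, se = 0.10, 0.04
z = 1.96                      # two-sided 95 percent critical value
ci = (mean - z * se, mean + z * se)
print(f"95% CI: [{ci[0]:.3f}, {ci[1]:.3f}]")
```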

Curricula with non-overlapping confidence intervals have average scores that are significantly different at the 5 percent level of confidence. The results are presented in standard deviations, which means that subtracting the average values (the dots) for any two curricula indicates the effect size of using the first curriculum instead of the second. The effect sizes discussed below were calculated by dividing each pair-wise curriculum comparison by the pooled standard deviation of the spring score for the two curricula being compared, and Hedges’ g formula (with the correction for small-sample bias) was used to calculate the pooled standard deviations. Appendix D presents averages of the unadjusted math scores (see Table D.3). The relative effects of the curricula described below are similar when based on the simple averages, although the confidence intervals are wider than those based on the HLM-adjusted averages, as expected.
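The effect-size arithmetic described above can be sketched as follows. The means, standard deviations, and sample sizes below are made up for illustration; only the pooled-standard-deviation and small-sample-correction formulas mirror the procedure described in the text:

```python
# Illustrative Hedges' g: standardized mean difference using the pooled
# standard deviation, with the small-sample bias correction
# J = 1 - 3 / (4*(n1 + n2) - 9).
from math import sqrt

def hedges_g(mean1, sd1, n1, mean2, sd2, n2):
    pooled_sd = sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2))
    d = (mean1 - mean2) / pooled_sd
    correction = 1 - 3 / (4 * (n1 + n2) - 9)
    return d * correction

# Hypothetical adjusted spring scores for two curriculum groups:
print(f"g = {hedges_g(52.0, 10.0, 400, 49.0, 10.0, 400):.3f}")
```

With equal standard deviations of 10 and a 3-point difference in means, the sketch yields an effect size of about 0.30 standard deviations, comparable in magnitude to the largest differentials reported below.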

(Figure: Average HLM-Adjusted Spring Math Score with Confidence Interval, by Curriculum (in standard deviations))

The figure shows that:

  • Student math achievement was significantly higher in schools assigned to Math Expressions and Saxon than in schools assigned to Investigations and SFAW. Average HLM-adjusted spring math achievement of Math Expressions and Saxon students was 0.30 standard deviations higher than that of Investigations students and 0.24 standard deviations higher than that of SFAW students. For a student at the 50th percentile in math achievement, these effects mean that the student’s percentile rank would be 9 to 12 points higher if the school used Math Expressions or Saxon instead of Investigations or SFAW.
  • Math achievement in schools assigned to the two more effective curricula (Math Expressions and Saxon) was not significantly different, nor was math achievement in schools assigned to the two less effective curricula (Investigations and SFAW). The Math Expressions-Saxon and Investigations-SFAW differentials equal 0.02 and -0.07 standard deviations, respectively, and neither is statistically significant.
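The percentile-rank interpretation in the first bullet can be checked with the standard normal CDF: moving a student at the median up by 0.24 or 0.30 standard deviations corresponds to roughly a 9- to 12-point gain in percentile rank. This is a back-of-the-envelope sketch, not the study's calculation:

```python
# Convert an effect size (in standard deviations) into the percentile-rank
# gain for a student starting at the 50th percentile, assuming normality.
from math import erf, sqrt

def normal_cdf(z):
    # Standard normal CDF via the error function.
    return 0.5 * (1 + erf(z / sqrt(2)))

for effect in (0.24, 0.30):
    gain = 100 * normal_cdf(effect) - 50
    print(f"{effect:.2f} SD -> about {gain:.0f} percentile points")
```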

We also examined whether the relative effects of the curricula differ along six characteristics that differentiate instructional settings: (1) participating districts, (2) school fall achievement, (3) school free/reduced-price meals eligibility, (4) teacher education, (5) teacher experience, and (6) teacher math content/pedagogical knowledge, measured before curriculum training began with an assessment administered by the study team. These characteristics were used to create 15 subgroups: one for each of the four districts, three based on school fall achievement, and two for each of the other four characteristics.

Eight of the 15 subgroup analyses found statistically significant differences in student math achievement between curricula. The significant curriculum differences ranged from 0.28 to 0.71 standard deviations, and all of them favored Math Expressions or Saxon over Investigations or SFAW. There were no subgroups for which Investigations or SFAW showed a statistically significant advantage.


5 As mentioned above, the 5 percent level of confidence means there is no more than a 5 percent chance that any finding discussed could have occurred by chance.
6 Each classroom in the current sample was observed once during the 2006-2007 school year. Those observations are not presented in this report because the reliability of those data cannot be assessed until observations have been completed in all the study schools.
7 Adherence to the essential features of each curriculum also was examined and is presented in Chapter II. Several analytical approaches can be used to examine adherence, but only one could be supported by the relatively small teacher sample sizes currently available for each curriculum. Because it would be useful to examine whether the results are sensitive to the other analytical approaches, we do not make general statements about adherence in the executive summary; readers interested in the adherence analysis we were able to conduct should see Chapter II. A planned future report (described at the end of the executive summary) will have larger teacher sample sizes that can support the other analyses.