Appendix A1.1 Study characteristics: Agodini et al., 2009
| Characteristic | Description |
|---|---|
| Study citation | Agodini, R., Harris, B., Atkins-Burnett, S., Heaviside, S., Novak, T., & Murphy, R. (2009). Achievement effects of four early elementary school math curricula: Findings from first graders in 39 schools (NCEE 2009-4052). Washington, DC: National Center for Education Evaluation and Regional Assistance, Institute of Education Sciences, U.S. Department of Education. |
| Participants | The researchers recruited 40 schools from four geographically dispersed districts with Title I schools. Each district had to include at least four schools willing to participate in the study, to support implementation of the study’s four curricula in each district. Within each of the participating districts, the schools were randomly assigned to one of the four curricula prior to the start of the school year, thereby setting up an experiment in each district. Roughly 10 students were randomly selected for assessment from each first-grade classroom in the study schools. The 40 schools included 1,457 first-grade students from 134 classrooms. One school dropped out of the study, leaving 39 in the analysis sample. The analysis sample included 1,309 first-grade students in 131 classrooms. The relative effects of the curricula were calculated by comparing math achievement of students in the four curriculum groups at the end of the 2006–07 academic year. Sixty-nine percent of students were eligible for free or reduced-price lunch. Fifty-four percent of schools in the study were schoolwide Title I eligible, compared to 41 percent nationwide. |
| Setting | The four districts were located in Connecticut, Minnesota, New York, and Nevada. They included two districts in urban areas, one in a suburban area, and one in a rural area. Each district contained Title I schools. |
| Intervention | First-grade teachers implemented the Saxon Math curriculum published by Harcourt Achieve. |
| Comparison | Three other curricula were used in the study: (1) Investigations in Number, Data, and Space (Investigations); (2) Math Expressions; and (3) Scott Foresman–Addison Wesley Mathematics (SFAW). The authors note that a “business-as-usual” control group was not included because it would have contained a variety of curricula used by the participating districts, making it difficult to interpret effects of the individual curricula in the study. |
| Primary outcomes and measurement | The authors measured math achievement using the assessment developed for the National Center for Education Statistics’ Early Childhood Longitudinal Study–Kindergarten Class of 1998–99 (ECLS-K). For a more detailed description of the outcome measure, see Appendix A2. |
| Staff/teacher training | Teachers in the study received training by the publishers of their assigned curriculum. All teachers received a one-to-two-day training at the start of the school year and follow-up training during the school year. Ninety-six percent attended follow-up training on their assigned curriculum. |
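
The within-district random assignment described under Participants can be sketched as follows. This is only an illustration under stated assumptions, not the study's actual procedure: the district and school names, the seed, and the `assign_within_district` helper are hypothetical, and the report does not say how schools were balanced across curricula within a district.

```python
import random

# Curriculum names as listed in the Intervention and Comparison rows.
CURRICULA = ["Investigations", "Math Expressions", "Saxon Math", "SFAW"]

def assign_within_district(schools_by_district, curricula=CURRICULA, seed=0):
    """Randomly assign each school to a curriculum, separately per district.

    Hypothetical helper: schools are shuffled within their district and the
    curricula are dealt out in rotation, so every curriculum appears in any
    district that has at least as many schools as there are curricula.
    """
    rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
    assignment = {}
    for district, schools in schools_by_district.items():
        shuffled = list(schools)
        rng.shuffle(shuffled)
        for i, school in enumerate(shuffled):
            assignment[school] = curricula[i % len(curricula)]
    return assignment

# Hypothetical layout: 40 schools split evenly over four districts.
districts = {
    f"district_{d}": [f"school_{d}_{s}" for s in range(10)] for d in "ABCD"
}
assignment = assign_within_district(districts)
```

Treating each district as its own block is what the table means by "setting up an experiment in each district": curricula are compared among schools that share a district.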
Appendix A1.2 Study characteristics: Good, Bickel, & Howley, 2006
| Characteristic | Description |
|---|---|
| Study citation | Good, K., Bickel, R., & Howley, C. (2006). Saxon Elementary Math program effectiveness study. Charleston, WV: Edvantia. |
| Participants | Participants were 1,476 students between kindergarten and third grade from 57 schools. In spring 2005, Harcourt Achieve sent Edvantia researchers a spreadsheet containing the names of U.S. schools implementing the Saxon Elementary School Math program. Edvantia staff randomly selected schools to participate in the study. Of the 40 Saxon schools asked, 33 agreed. Twenty-four comparison schools were selected based on their similarities to the experimental schools on several measures, including school size; grade-level configuration; percentage of students eligible for free and reduced-price school lunch (the conventional education-research proxy measure for poverty); percentage of racial and ethnic minority students; migrant percentages; charter school designation; Title I school designation; locale (for example, urban, rural, large town, or small town); and geographic location. Data with which to identify matches were obtained from the U.S. Department of Education’s National Center for Education Statistics Common Core of Data for public schools from the 2003–04 school year. |
| Setting | The experimental and comparison schools were located across 16 states, including Alabama (1 school), Arizona (5 schools), California (6 schools), Georgia (3 schools), Indiana (1 school), North Carolina (9 schools), Nebraska (5 schools), Nevada (2 schools), New York (2 schools), Oklahoma (9 schools), Oregon (2 schools), Tennessee (2 schools), Texas (2 schools), Utah (1 school), Virginia (6 schools), and Washington (1 school). |
| Intervention | The intervention condition occurred over the 2005–06 school year. Teachers implemented the Saxon Elementary School Math program. |
| Comparison | Comparison-group teachers implemented a variety of other curricula, and some reported using skills that were part of the Saxon curriculum. The publishers of the programs tended to be Harcourt Brace, Houghton Mifflin, Silver Burdett Ginn, McGraw-Hill, and Scott Foresman. |
| Primary outcomes and measurement | The Stanford Achievement Test, Ninth Edition (SAT 9) was administered as the pretest and posttest measure of math achievement. Participating students completed only the math subtest of the SAT 9. In the fall, students took the appropriate grade-level versions of the SAT 9: the SESAT 1, SESAT 2, abbreviated Primary 1, or abbreviated Primary 2 tests, respectively, for kindergarten through third grade. The tests administered to K–3 students in the spring included the SESAT 2, abbreviated Primary 1, abbreviated Primary 2, and abbreviated Primary 3. The tests were administered by either the classroom teacher or the site coordinator. For a more detailed description of these outcome measures, see Appendix A2. |
| Staff/teacher training | Training is not described in the study. |
Appendix A1.3 Study characteristics: Resendez & Manley, 2005
| Characteristic | Description |
|---|---|
| Study citation | Resendez, M., & Manley, M. A. (2005). The relationship between using Saxon Elementary and Middle School Math and student performance on Georgia statewide assessments. Orlando, FL: Harcourt Achieve. |
| Participants | The participants in this study were students in grades 1–8 in 170 intervention schools and 172 comparison schools that were matched based on student demographics. This intervention report focuses only on findings for grades 1–5, because grades 6–8 are outside of the scope of this review.1 The authors selected Georgia schools that used the Saxon Elementary School Math curriculum between 2000 and 2005. The sample was obtained from the Georgia Department of Education. The authors note that per state policy, only school-level data could be released. Data for the intervention group came from 85 schools for first grade, 85 schools for second grade, 83 schools for third grade, 79 schools for fourth grade, and 79 schools for fifth grade. Data for the comparison group came from 144 schools for first grade, 144 schools for second grade, 135 schools for third grade, 131 schools for fourth grade, and 129 schools for fifth grade. The numbers of schools per grade are not mutually exclusive. Some of the schools contained multiple grades, so the numbers presented do not represent distinct clusters of schools. |
| Setting | The sample schools were distributed across the state of Georgia and represented a mixture of rural, urban, and suburban communities. The gender and racial compositions of the schools were similar in the intervention schools and comparison schools, with roughly equal gender distribution and more than half of the students white. Both study conditions were also similar in terms of the percent of students with disabilities, students with limited English proficiency, and students categorized as gifted. |
| Intervention | The Saxon Elementary School Math curriculum was used as a core curriculum in the intervention schools. The elementary schools in the sample used the version of the Saxon Elementary School Math program that was appropriate for each grade level, and participating schools had used the program for an average of three years (with a range of 1–15 years). |
| Comparison | The schools in the comparison group used a mixture of non-Saxon curricula. Sixty-two percent of the schools in the comparison group used basal math curricula with chapter-based approaches to teaching math. Five percent of the schools used curricula with an investigative approach. The remaining third of the schools used curricula that were a mix of basal, investigative, and computer-based approaches. The authors reported no significant differences in baseline math performance between the Saxon and non-Saxon schools. |
| Primary outcomes and measurement | The outcome measure was Georgia’s Criterion-Referenced Competency Test (CRCT), which assesses competency in number sense and numeration, geometry and measurement, patterns and relations/algebra, statistics and probability, computation and estimation, and problem solving. Fourth-grade students were tested in each school year from 1999–00 to 2004–05. First-grade, second-grade, third-grade, and fifth-grade students were tested in the spring of school years 2001–02, 2003–04, and 2004–05. All posttest scores are from spring 2005. For a more detailed description of this outcome measure, see Appendix A2. |
| Staff/teacher training | No information was provided regarding the teacher training for the intervention. |
| 1 Results from grades 6–8 are being reviewed as part of the WWC Middle School Math review. | |
Appendix A2 Outcome measures for the mathematics achievement domain
| Outcome measure | Description |
|---|---|
| Early Childhood Longitudinal Study–Kindergarten (ECLS-K), Math Assessment | This is an individually administered, nationally normed assessment capable of measuring math achievement gains from kindergarten through grade 8. It was developed for the National Center for Education Statistics’ Early Childhood Longitudinal Study–Kindergarten Class of 1998–99 (ECLS-K). |
| Stanford Achievement Test, Ninth Edition (SAT 9), Math Subtest | The SAT 9 math subtest is a nationally normed assessment published by Pearson Education. It is composed of two parts: problem solving and mathematics procedures. The SAT 9 math subtest was developed in alignment with the National Council of Teachers of Mathematics’ Curriculum and Evaluation Standards for School Mathematics.1 |
| Georgia’s Criterion-Referenced Competency Test (CRCT),2 Mathematics | As cited in Resendez and Manley (2005), the CRCT is a criterion-referenced test that is referenced to Georgia’s Quality Core Curriculum Goals. According to the Georgia Department of Education, the CRCT is a multiple-choice test that is valid and reliable for Georgia’s public school students.3 The CRCT math scores range from 150 to 450, with scores below 300 not meeting standards and scores above 350 exceeding standards. The criteria for meeting the standards vary by objective and grade level. Five objectives are covered by the test: (1) numbers and number sense; (2) geometry and measurement; (3) patterns, relationships, and algebra; (4) computation and estimation; and (5) problem solving. The cut points are set by the state and take into account the difficulty of each specific objective. |
| 1 See the product description at http://www.pearsonassessments.com/HAIWEB/Cultures/en-us/Productdetail.htm?Pid=E139A. | |
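
The CRCT score bands described above can be expressed as a small classifier. This is a sketch under assumptions: the function name and the label strings are hypothetical, and the table does not name the middle band, so treating scores from 300 through 350 as meeting standards is an inference from the two stated cut points.

```python
def crct_performance_level(score):
    """Map a CRCT math scale score (150 to 450) to a performance band."""
    if not 150 <= score <= 450:
        raise ValueError("CRCT math scores range from 150 to 450")
    if score < 300:
        return "does not meet standards"
    if score > 350:
        return "exceeds standards"
    # Assumption: everything between the two stated cut points meets standards.
    return "meets standards"

levels = [crct_performance_level(s) for s in (299, 300, 350, 351)]
```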
Appendix A3 Summary of study findings included in the rating for the mathematics achievement domain¹
| Outcome measure | Study sample | Sample size (schools/students) | Authors’ findings: Saxon Math group mean outcome (standard deviation²) | Authors’ findings: comparison group mean outcome (standard deviation²) | WWC calculations: mean difference³ (Saxon Math – comparison) | WWC calculations: effect size⁴ | WWC calculations: statistical significance⁵ (at α = 0.05) | WWC calculations: improvement index⁶ |
|---|---|---|---|---|---|---|---|---|
| Agodini et al., 2009 (randomized controlled trial)⁷ | ||||||||
| ECLS-K | Grade 1 (versus Investigations) | 19/636 | 47.36⁸ (7.62) | 44.87 (8.64) | 2.49 | 0.30 | Statistically significant | +12 |
| ECLS-K | Grade 1 (versus Math Expressions) | 18/618 | 45.27⁸ (7.62) | 45.45 (8.97) | –0.18 | –0.02 | ns | –1 |
| ECLS-K | Grade 1 (versus SFAW) | 20/663 | 46.21⁸ (7.62) | 44.28 (8.27) | 1.93 | 0.24 | Statistically significant | +10 |
| Average for mathematics achievement (Agodini et al., 2009)⁹ | | | | | | 0.17 | Statistically significant | +7 |
| Good, Bickel, & Howley, 2006⁷ | ||||||||
| SAT 9 | Grades K–3 | 57/1,476 | 580.10¹⁰ (63.37) | 575.82¹⁰ (58.66) | 4.28 | 0.07 | ns | +3 |
| Average for mathematics achievement (Good, Bickel, & Howley, 2006)⁹ | | | | | | 0.07 | ns | +3 |
| Resendez & Manley, 2005⁷ | ||||||||
| CRCT | Grade 1 | 229/nr | 86.26¹¹ (nr) | 85.20¹¹ (nr) | 1.06 | na¹² | ns | na¹² |
| CRCT | Grade 2 | 229/nr | 88.31¹¹ (nr) | 86.86¹¹ (nr) | 1.45 | na¹² | ns | na¹² |
| CRCT | Grade 3 | 218/nr | 86.94¹¹ (nr) | 85.93¹¹ (nr) | 1.01 | na¹² | ns | na¹² |
| CRCT | Grade 4 | 210/nr | 73.92¹¹ (nr) | 71.39¹¹ (nr) | 2.53 | na¹² | ns | na¹² |
| CRCT | Grade 5 | 208/nr | 82.86¹¹ (nr) | 81.66¹¹ (nr) | 0.80 | na¹² | ns | na¹² |
| Average for mathematics achievement (Resendez & Manley, 2005)⁹ | | | | | | na¹² | ns | na¹² |
| Domain average for mathematics achievement across all studies⁹ | | | | | | 0.12 | na | +5 |

ns = not statistically significant
1 This appendix reports findings considered for the effectiveness rating and the average improvement indices for the mathematics achievement domain. Subgroup and subtest findings from the same studies are not included in these ratings but are reported in Appendices A4.1 and A4.2, respectively.
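
The averages and improvement indices in Appendix A3 can be approximately reproduced from the reported effect sizes. The sketch below assumes the WWC's usual definition of the improvement index, the percentile rank of the average intervention student in the comparison distribution minus 50; the function and variable names are hypothetical. It works from the rounded effect sizes shown in the table, so a recomputed index can differ from the reported one by a point. For example, 0.24 gives about +9 where the table reports +10, presumably because the report used unrounded effect sizes.

```python
from statistics import NormalDist, mean

def improvement_index(effect_size):
    """Percentile-point change for an average comparison-group student."""
    return NormalDist().cdf(effect_size) * 100 - 50

# Effect sizes reported for the three Agodini et al. (2009) comparisons.
agodini_effects = [0.30, -0.02, 0.24]
agodini_average = round(mean(agodini_effects), 2)

# Study-level averages that have an effect size; the Resendez & Manley
# (2005) average is "na" in the table and is left out here.
study_averages = [0.17, 0.07]
domain_average = round(mean(study_averages), 2)

domain_index = round(improvement_index(domain_average))
```

With these inputs the study average comes out to 0.17 and the domain average to 0.12, which rounds to an improvement index of +5, matching the table.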
Appendix A4.1 Summary of subgroup findings for the mathematics achievement domain¹
| Outcome measure | Study sample⁴ | Sample size (students)⁵ | Authors’ findings²: Saxon Math group mean outcome (standard deviation³) | Authors’ findings²: comparison group mean outcome (standard deviation³) | WWC calculations: mean difference (Saxon Math – comparison) | WWC calculations: effect size⁶ | WWC calculations: statistical significance⁷ (at α = 0.05) | WWC calculations: improvement index⁸ |
|---|---|---|---|---|---|---|---|---|
| Agodini et al., 2009⁹ | ||||||||
| Comparison 1: Saxon Math compared with Investigations in Number, Data, and Space | ||||||||
| ECLS-K | Lowest third | 179 | nr¹⁰ | nr¹⁰ | nr¹⁰ | 0.71 | Statistically significant | +26 |
| ECLS-K | Middle third | 159 | nr¹⁰ | nr¹⁰ | nr¹⁰ | 0.17 | ns | +7 |
| ECLS-K | Highest third | 298 | nr¹⁰ | nr¹⁰ | nr¹⁰ | 0.15 | ns | +6 |
| ECLS-K | Up to 40% FRP | 378 | nr¹⁰ | nr¹⁰ | nr¹⁰ | 0.31 | ns | +12 |
| ECLS-K | Greater than 40% FRP | 258 | nr¹⁰ | nr¹⁰ | nr¹⁰ | 0.37 | ns | +14 |
| Comparison 2: Saxon Math compared with Math Expressions | ||||||||
| ECLS-K | Lowest third | 206 | nr¹⁰ | nr¹⁰ | nr¹⁰ | 0.32 | ns | +13 |
| ECLS-K | Middle third | 205 | nr¹⁰ | nr¹⁰ | nr¹⁰ | –0.20 | ns | –8 |
| ECLS-K | Highest third | 207 | nr¹⁰ | nr¹⁰ | nr¹⁰ | –0.08 | ns | –3 |
| ECLS-K | Up to 40% FRP | 316 | nr¹⁰ | nr¹⁰ | nr¹⁰ | –0.01 | ns | 0 |
| ECLS-K | Greater than 40% FRP | 302 | nr¹⁰ | nr¹⁰ | nr¹⁰ | –0.02 | ns | –1 |
| Comparison 3: Saxon Math compared with Scott Foresman–Addison Wesley Elementary Mathematics | ||||||||
| ECLS-K | Lowest third | 201 | nr¹⁰ | nr¹⁰ | nr¹⁰ | 0.56 | Statistically significant | +21 |
| ECLS-K | Middle third | 195 | nr¹⁰ | nr¹⁰ | nr¹⁰ | –0.01 | ns | 0 |
| ECLS-K | Highest third | 267 | nr¹⁰ | nr¹⁰ | nr¹⁰ | 0.18 | ns | +7 |
| ECLS-K | Up to 40% FRP | 346 | nr¹⁰ | nr¹⁰ | nr¹⁰ | 0.30 | ns | +12 |
| ECLS-K | Greater than 40% FRP | 317 | nr¹⁰ | nr¹⁰ | nr¹⁰ | 0.20 | ns | +8 |

ns = not statistically significant
1 This appendix presents subgroup findings for measures that fall in the mathematics achievement domain. Total group scores were used for rating purposes and are presented in Appendix A3.
Appendix A4.2 Summary of subscale findings for the mathematics achievement domain¹
| Outcome measure | Study sample | Sample size (schools) | Authors’ findings: Saxon Math group³ mean outcome (standard deviation)² | Authors’ findings: comparison group³ mean outcome (standard deviation)² | WWC calculations: mean difference⁴ (Saxon Math – comparison) | WWC calculations: effect size⁵ | WWC calculations: statistical significance⁶ (at α = 0.05) | WWC calculations: improvement index⁷ |
|---|---|---|---|---|---|---|---|---|
| Resendez & Manley, 2005 (quasi-experimental design)⁸ | ||||||||
| CRCT: Numbers and number sense | Grade 1 | 229 | 89.53 (nr) | 88.52 (nr) | 1.01 | na⁹ | ns | na⁹ |
| CRCT: Geometry and measurement | Grade 1 | 229 | 90.34 (nr) | 90.29 (nr) | 0.05 | na⁹ | ns | na⁹ |
| CRCT: Patterns, relations, and algebra | Grade 1 | 229 | 87.88 (nr) | 86.28 (nr) | 1.60 | na⁹ | ns | na⁹ |
| CRCT: Computation and estimation | Grade 1 | 229 | 78.93 (nr) | 77.43 (nr) | 1.50 | na⁹ | ns | na⁹ |
| CRCT: Problem solving | Grade 1 | 229 | 84.64 (nr) | 83.49 (nr) | 1.15 | na⁹ | ns | na⁹ |
| CRCT: Numbers and number sense | Grade 2 | 229 | 88.57 (nr) | 86.62 (nr) | 1.95 | na⁹ | ns | na⁹ |
| CRCT: Geometry and measurement | Grade 2 | 229 | 91.46 (nr) | 92.36 (nr) | –0.90 | na⁹ | ns | na⁹ |
| CRCT: Patterns, relations, and algebra | Grade 2 | 229 | 87.05 (nr) | 83.58 (nr) | 3.47 | na⁹ | Statistically significant | na⁹ |
| CRCT: Computation and estimation | Grade 2 | 229 | 86.93 (nr) | 85.83 (nr) | 1.10 | na⁹ | ns | na⁹ |
| CRCT: Problem solving | Grade 2 | 229 | 87.54 (nr) | 85.93 (nr) | 1.61 | na⁹ | ns | na⁹ |
| CRCT: Numbers and number sense | Grade 3 | 218 | 89.74 (nr) | 88.24 (nr) | 1.50 | na⁹ | ns | na⁹ |
| CRCT: Geometry and measurement | Grade 3 | 218 | 93.60 (nr) | 92.24 (nr) | 1.36 | na⁹ | ns | na⁹ |
| CRCT: Patterns, relations, and algebra | Grade 3 | 218 | 86.26 (nr) | 85.90 (nr) | 0.36 | na⁹ | ns | na⁹ |
| CRCT: Statistics and computation | Grade 3 | 218 | 87.13 (nr) | 85.83 (nr) | 1.30 | na⁹ | ns | na⁹ |
| CRCT: Computation and estimation | Grade 3 | 218 | 86.81 (nr) | 85.71 (nr) | 1.10 | na⁹ | ns | na⁹ |
| CRCT: Problem solving | Grade 3 | 218 | 78.11 (nr) | 77.64 (nr) | 0.47 | na⁹ | ns | na⁹ |
| CRCT: Numbers and number sense | Grade 4 | 210 | 71.47 (nr) | 70.85 (nr) | 0.62 | na⁹ | ns | na⁹ |
| CRCT: Geometry and measurement | Grade 4 | 210 | 79.22 (nr) | 78.16 (nr) | 1.06 | na⁹ | ns | na⁹ |
| CRCT: Patterns, relations, and algebra | Grade 4 | 210 | 69.76 (nr) | 67.70 (nr) | 2.06 | na⁹ | ns | na⁹ |
| CRCT: Statistics and computation | Grade 4 | 210 | 82.15 (nr) | 80.17 (nr) | 1.98 | na⁹ | ns | na⁹ |
| CRCT: Computation and estimation | Grade 4 | 210 | 73.12 (nr) | 67.65 (nr) | 5.47 | na⁹ | Statistically significant | na⁹ |
| CRCT: Problem solving | Grade 4 | 210 | 67.81 (nr) | 63.83 (nr) | 3.98 | na⁹ | Statistically significant | na⁹ |
| CRCT: Numbers and number sense | Grade 5 | 208 | 79.74 (nr) | 77.31 (nr) | 2.43 | na⁹ | ns | na⁹ |
| CRCT: Geometry and measurement | Grade 5 | 208 | 80.77 (nr) | 81.54 (nr) | –0.77 | na⁹ | ns | na⁹ |
| CRCT: Patterns, relations and algebra | Grade 5 | 208 | 76.16 (nr) | 74.56 (nr) | 1.60 | na⁹ | ns | na⁹ |
| CRCT: Statistics and computation | Grade 5 | 208 | 79.82 (nr) | 81.52 (nr) | –1.70 | na⁹ | ns | na⁹ |
| CRCT: Computation and estimation | Grade 5 | 208 | 88.74 (nr) | 86.62 (nr) | 2.12 | na⁹ | ns | na⁹ |
| CRCT: Problem solving | Grade 5 | 208 | 89.55 (nr) | 88.43 (nr) | 1.12 | na⁹ | ns | na⁹ |

ns = not statistically significant
1 This appendix presents subscale findings for measures that fall in the mathematics achievement domain. Total scale scores were used for rating purposes and are presented in Appendix A3.
Appendix A5 Saxon Elementary School Math rating for the mathematics achievement domain
The WWC rates an intervention’s effects for a given outcome domain as positive, potentially positive, mixed, no discernible effects, potentially negative, or negative.1
For the outcome domain of mathematics achievement, the WWC rated Saxon Elementary School Math as having mixed effects for elementary school students. The remaining ratings (no discernible effects, potentially negative effects, and negative effects) were not considered, as Saxon Elementary School Math was assigned the highest applicable rating.
| Rating received |
|---|
| Mixed effects: Evidence of inconsistent effects as demonstrated through either of the following criteria. |
| Other ratings considered |
| Positive effects: Strong evidence of a positive effect with no overriding contrary evidence. |
| Potentially positive effects: Evidence of a positive effect with no overriding contrary evidence. |
| 1 For rating purposes, the WWC considers the statistical significance of individual outcomes and the domain-level effect. The WWC also considers the size of the domain-level effect for ratings of potentially positive or potentially negative effects. For a complete description, see the WWC Procedures and Standards Handbook, Appendix E. |
Appendix A6 Extent of evidence by domain
| Outcome domain | Number of studies | Sample size: schools | Sample size: students | Extent of evidence¹ |
|---|---|---|---|---|
| Mathematics achievement | 3 | 325 | na | Medium to large |

na = not applicable/not studied. Total number of students not reported in all of the relevant studies.
1 A rating of “medium to large” requires at least two studies and two schools across studies in one domain and a total sample size across studies of at least 350 students or 14 classrooms. Otherwise, the rating is “small.” For more details on the extent of evidence categorization, see the WWC Procedures and Standards Handbook, Appendix G.
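
The extent-of-evidence rule in the footnote above can be written out as a short check. This is a sketch of the rule as worded there, not WWC code: the function and parameter names are hypothetical. Because the total student count is listed as na, the example uses only the student counts that Appendices A1.1 and A1.2 do report (1,309 and 1,476), which already exceed the 350-student threshold.

```python
def extent_of_evidence(n_studies, n_schools, n_students=None, n_classrooms=None):
    """Return "medium to large" or "small" following the footnoted rule."""
    # Missing counts (None) are treated as not meeting the threshold.
    enough_students = n_students is not None and n_students >= 350
    enough_classrooms = n_classrooms is not None and n_classrooms >= 14
    if n_studies >= 2 and n_schools >= 2 and (enough_students or enough_classrooms):
        return "medium to large"
    return "small"

# Three studies and 325 schools, with students counted only from the two
# studies that report them (Agodini et al., 2009; Good, Bickel, & Howley, 2006).
rating = extent_of_evidence(n_studies=3, n_schools=325, n_students=1309 + 1476)
```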