Regional Educational Laboratory Program


An Evaluation of Number Rockets: A Tier 2 Intervention for Grade 1 Students At Risk for Difficulties in Mathematics

Regional need and study purpose

Over the last decade, much attention in education has focused on early reading skills (U.S. Department of Education 2008a). A recent National Mathematics Advisory Panel report revealed an urgent need to focus on early math skills as well (U.S. Department of Education 2008b). American students lag behind students of other industrialized nations in math. In the 2003 Trends in International Mathematics and Science Study, the United States ranked only 12th among 25 countries in grade 4 math and 15th among 46 countries in grade 8 math (Gonzales et al. 2004). The trend continues in older students; on the 2003 Program for International Student Assessment, American students age 15 placed 27th in math among 39 countries (Lemke et al. 2004). In the Southwest Region specifically, there is a large discrepancy in math performance among the states. On the 2007 National Assessment of Educational Progress, New Mexico (3rd) and Oklahoma (9th) fared relatively well nationally, but Louisiana (41st), Texas (43rd), and Arkansas (44th) ranked substantially lower (U.S. Department of Education 2007). The poor performance of students in math, both nationally and regionally, is of concern to stakeholders and policymakers.

Under the reauthorization of the Individuals with Disabilities Education Act of 2004, schools can now use alternative methods to determine student eligibility for special education services. The act also encourages schools to intervene as soon as students begin to struggle, before their performance declines (Individuals with Disabilities Education Act of 2004; U.S. Department of Education 2008c). One method for intervening early with struggling students—and one proposed as an alternative to the ability-achievement discrepancy1 for identifying students with learning disabilities and for reducing referrals for special education services (Individuals with Disabilities Education Act of 2004)—is the response to intervention (RTI) model (Fuchs and Fuchs 2006).

In RTI models schools provide interventions of increasing intensity to struggling students to improve their achievement (Gersten et al. 2009). RTI model interventions are multitiered: tier 1 consists of research-based core instruction delivered in the classroom and differentiated instruction based on individual student needs; tier 2 (and higher) comprises increasing levels of targeted and individualized instruction. Essential to the RTI model are valid and feasible screening measures to identify students at risk of not achieving grade-level performance at the end of the school year (tier 1) and validated interventions provided in addition to classroom instruction (tier 2) for these students (Gersten et al. 2009). Although current legislation indicates that RTI can play a role in determining a child's eligibility for special education, the goal is to provide instructional assistance to students as soon as they begin falling behind.

Empirical studies of response to intervention in reading have demonstrated its feasibility and possible benefits (Burns, Appleton, and Stehouwer 2005; Case, Speece, and Molloy 2003; Vaughn, Linan-Thompson, and Hickman-Davis 2003). These preliminary results, however, were based on quasi-experimental study designs.2 Thus, while they were able to link interventions with improved student achievement, they could not claim that the intervention caused the improvement, as studies employing randomized controlled trials could. A recent review of tier 2 interventions in beginning reading identified 18 studies (of which only 5 were randomized controlled trials) that analyzed the effects of response to intervention on student reading achievement (Wanzek and Vaughn 2007). In math there are four such studies of tier 2 interventions that use random assignment (Newman-Gonchar, Clarke, and Gersten 2009).

Small group tutoring, an intervention that delivers targeted instruction to students in small groups, is one of the few tier 2 math interventions that has demonstrated positive results in a smaller scale study employing random assignment. The current study is an effectiveness evaluation of the small group tutoring intervention in math for at-risk students in grade 1 that was examined in the Fuchs et al. (2005) efficacy study. Fuchs et al. demonstrated that the intervention is effective across several schools within a single district. The current study will provide rigorous causal evidence of whether the intervention can be effective in four urban districts in four states. If the intervention is effective on this larger scale, it may meet an important need for validated tier 2 math interventions suitable for use within RTI models.

Efficacy trials help researchers determine whether an intervention (in this case, small group tutoring) is feasible and practical for implementation in schools and whether the intervention produces the desired impact for a target population (in this case, improving math achievement for at-risk students). In the Fuchs et al. (2005) efficacy study, small group tutoring was shown to be both feasible and practical and to improve student performance in math in one school district.

In effectiveness studies, such as the current study, researchers replicate interventions that have demonstrated positive effects in smaller efficacy trials. This is done across a variety of settings to determine whether the interventions still demonstrate positive results when implemented on a larger scale.

The goal of this study is to evaluate on a large scale the Fuchs et al. (2005) small group tutoring intervention for at-risk students in grade 1. The study addresses one question:

  • Do grade 1 students at risk in math receiving the small group tutoring intervention outperform at-risk control students on the Test of Early Mathematics Ability–Third Edition (Ginsburg and Baroody 2003)?

This study is not designed to provide rigorous causal evidence of program impacts for specific student subgroups, such as English language learner students. The tutoring was available for English-speaking students only and did not address issues that English language learner students at risk in math might have. In addition, the results represent only 70–80 percent of the students in grade 1 at participating schools because not all parents consented to their child's participation in the study. The study was also restricted to four large urban school districts in the Southwest Region. If the intervention is found effective in the current study, it would provide additional evidence (beyond Fuchs et al. 2005) that the intervention is generalizable to other settings and can be implemented at scale. However, the study design does not enable conclusive statements to be made about generalization. The schools and districts were not sampled at random, nor were they representative of any population. Results from this study will not necessarily apply to other settings, such as districts in rural areas or districts with different demographics.

1 The ability-achievement discrepancy is typically defined as a significant difference between a child's intelligence test score and scores on achievement tests (Sattler 2001).

2 The gold standard for research design is the randomized controlled trial (Shadish, Cook, and Campbell 2002). These designs are characterized by random assignment of individuals (or units) to two or more experimental conditions. They provide the strongest evidence of causality, supporting statements that the intervention condition was the cause of an observed effect. Quasi-experimental studies, often those reporting only correlations or regression analyses, do not employ random assignment and may not have control or comparison conditions. The lack of random assignment increases the possibility that other variables, not controlled by the researcher, may have caused any observed effects or relationships. Quasi-experimental studies do not provide the same level of evidence to support causal statements—such as "this intervention caused this outcome"—as do randomized controlled trials.
