What Works Clearinghouse


Appendix A1.1 Study characteristics: Rouse & Krueger, 2004

Characteristic Description
Study citation Rouse, C. E., & Krueger, A. B. (2004). Putting computerized instruction to the test: A randomized evaluation of a “scientifically based” reading program. Economics of Education Review, 23(4), 323–338.
Participants Groups were formed through a multistep process. Authors first identified an eligible population of students from four schools within one urban school district, focusing on third- to sixth-grade students who scored in the bottom 20% on the state’s standardized reading test administered in the 2001–02 school year. Consent letters were sent to these students’ parents. Principals in the schools were asked to identify students who could not sit through the daily 90- to 100-minute use of Fast ForWord®, those who had transferred to another school, and those students who might otherwise be unavailable (family away on long trip, for example). The remaining students were randomly assigned to either the treatment or control group, within each grade and school. In all, 237 students in the Fast ForWord® group and 217 students in the comparison group were included in the analysis sample.
Setting The study took place in four schools in an urban district in the northeastern United States. Forty percent of the district’s students were African-American and more than 50% were Hispanic. Almost 70% of students in the district qualified for the free or reduced-price lunch program, and 56% of the district’s students spoke a language other than English at home. The authors describe test scores in these schools as well below average and note that schools in the district adopted a whole-school reform, Success for All.
Intervention Fast ForWord® was primarily an add-on to regular reading instruction. In three schools, students in the treatment condition were pulled out of their regular classroom instruction for 90–100 minutes of computerized Fast ForWord® instruction per day and, in one school, they used Fast ForWord® for that same amount of time before or after school. Each school had to find a way to fit the use of Fast ForWord® into its unique schedule. In no case were students taken out of Success for All. The study reported students’ outcomes after six to eight weeks of program implementation.
Comparison The control group continued to receive the standard curriculum being used in district schools. Because Fast ForWord® students used the program during subjects such as math, science, language arts, special subjects (such as art, music, or gym), or homeroom, or, in the case of one school, before or after school, the counterfactual condition for the control group students was mixed.
Primary outcomes and measurement For both the pretest and posttest, the authors administered the Success for All assessment, the Clinical Evaluation of Language Fundamentals–Third Edition (the receptive portion and the Listening to Paragraph supplemental test), and a state standardized reading test (the authors did not indicate which state). For a more detailed description of test outcome measures, see Appendix A2.4.
Staff/teacher training Fast ForWord® staff provided training for Fast ForWord® instructors (those interacting with students) at the beginning of the study. Phone support was also provided for the duration of the study. Detailed information on the training of instructors was not provided.
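The Participants row above says students were randomly assigned to treatment or control within each grade and school. The study's actual procedure is not described in this appendix, so the following is only a minimal sketch of blocked random assignment under stated assumptions: the function name, the (student_id, school, grade) record layout, the even split, and the handling of odd-sized blocks are all hypothetical.

```python
import random
from collections import defaultdict


def assign_within_blocks(students, seed=0):
    """Split students into treatment/control within each (school, grade) block.

    `students` is a list of (student_id, school, grade) tuples. This layout is
    an assumption made for illustration; it is not taken from the study.
    """
    rng = random.Random(seed)

    # Group student IDs by their randomization block.
    blocks = defaultdict(list)
    for student_id, school, grade in students:
        blocks[(school, grade)].append(student_id)

    assignment = {}
    for block_key in sorted(blocks):
        ids = blocks[block_key]
        rng.shuffle(ids)
        half = len(ids) // 2
        # Assumption: in an odd-sized block the extra student goes to treatment.
        for position, student_id in enumerate(ids):
            assignment[student_id] = "control" if position < half else "treatment"
    return assignment


# Hypothetical roster: two schools, two grades, six students per block.
roster = [
    (f"s{school}g{grade}n{i}", school, grade)
    for school in ("A", "B")
    for grade in (3, 4)
    for i in range(6)
]
arms = assign_within_blocks(roster, seed=1)
print(sum(arm == "treatment" for arm in arms.values()), "of", len(arms), "in treatment")
```

Assigning within blocks keeps the two groups balanced on school and grade by construction, which is the usual reason for randomizing this way.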

Appendix A1.2 Study characteristics: Scientific Learning Corporation, 2007a

Characteristic Description
Study citation Scientific Learning Corporation. (2007a). Students in Western Australia improve language and literacy skills: Educator's briefing. Oakland, CA: Author.
Participants Students between ages 5 and 14 identified by classroom teachers as having difficulties in language, literacy, auditory processing, attention, and/or behavior were randomly assigned to immediate or delayed treatment conditions, with 72 students in each group. The intervention group that received Fast ForWord® either between February and April or May and July of 2006 was compared to the group of students who had not received Fast ForWord® as of April 2006. In all, 68 students in the Fast ForWord® group and 69 students in the comparison group were included in the analysis sample.
Setting The study took place at four primary schools in the Perth metropolitan area in Western Australia.
Intervention Fast ForWord® participation was scheduled during class time for most students, generally in place of their language-arts lesson. A few students participated before school and during recess and/or lunch breaks. All Fast ForWord® sessions were monitored by trained parent volunteers under the supervision of the school’s Fast ForWord® coordinator. Participants in the Fast ForWord® group used (1) the 50-minute Fast ForWord® Language protocol or the 48-minute Fast ForWord® Middle and High School protocol and (2) the 50-minute Fast ForWord® Language to Reading protocol. These protocols called for participants to use Fast ForWord® each day, five days a week, for 8 to 12 weeks. The study reported students’ outcomes after three months of program implementation.
Comparison The counterfactual in this study is regular classroom instruction. The comparison group used Fast ForWord® on a delayed schedule, either between May and July or July and September 2006.
Primary outcomes and measurement All tests were administered by speech pathology and occupational therapy students who were trained in the assessment process by qualified speech pathologists. Study students’ skills were measured both before and after use of the intervention. Alphabetic skills were measured by the Queensland University Inventory of Literacy (QUIL), whereas students’ skills in comprehension were measured by the Clinical Evaluation of Language Fundamentals (CELF)–Fourth Edition. For a more detailed description of these outcome measures, see Appendices A2.1 and A2.4.
Staff/teacher training Sonic Hearing, a private clinical practice with expertise in the Fast ForWord® programs, provided training for the parent monitors and support for the Fast ForWord® coordinator at each school. All Fast ForWord® sessions were monitored by these trained parent volunteers, under the supervision of the school’s Fast ForWord® coordinator. In addition, the lab supervisors at the schools were trained in current and established findings on the neuroscience of how phonemic awareness and the acoustic properties of speech affect development of language and reading skills, information on the efficacy of the products, effective implementation techniques, and monitoring student progress.

Appendix A1.3 Study characteristics: Beattie, 2000

Characteristic Description
Study citation Beattie, K. K. (2000). The effects of intensive computer-based language intervention on language functioning and reading achievement in language-impaired adolescents (Doctoral dissertation, George Mason University, 2000). Dissertation Abstracts International, 61(08A), 194–3116.
Participants Eighty-one 11- to 16-year-old students who scored in the bottom quartile on standardized reading or language tests were randomly assigned by computer-generated procedures to one of four intervention groups or to a control group in a two-step process.1 The researchers first assigned 18 students to the two intervention groups that received a phase of SuccessMaker and Fast ForWord® and also concomitantly participated in a functional magnetic resonance imaging research project. Then, the remaining participants were randomly assigned across the five groups. To ensure an equal distribution among groups, fewer students were placed in the first two groups at the second step of randomization. For this review, the WWC reported results from 12 students in the Fast ForWord® group who were compared to 12 students in the comparison group.2 Although the overall attrition rate was higher than 20%, the post-attrition intervention and comparison groups were equivalent on the pretest achievement measures.
Setting The study took place in two middle schools and one middle-high school located in the suburbs of a large metropolitan area in northern Virginia.
Intervention Students worked on Fast ForWord® for 90–94 minutes a day, five days a week. The intervention ended after each student completed 64–80 hours on the program. The study reported students’ outcomes after two months of program implementation.
Comparison The control group received the standard instruction provided as part of the regular school curriculum.
Primary outcomes and measurement For both pre- and posttests, the author administered the Gray Oral Reading Test, four subtests of the Woodcock-Johnson Psycho-Educational Battery (Letter-Word Identification, Word Attack, Passage Comprehension, and Auditory Processing), the Spelling subtest of the Wide Range Achievement Test, and the Receptive Language subtest of the Clinical Evaluation of Language Fundamentals. For a more detailed description of these outcome measures, see Appendices A2.1–A2.4.
Staff/teacher training No information on training for the teachers and staff in this study was provided. To facilitate the use of Fast ForWord®, computers were procured or updated to meet criteria for running Fast ForWord® software.
1 The first intervention group received two phases of Fast ForWord®; the second intervention group received two phases of SuccessMaker; and the third and fourth intervention groups received a phase of Fast ForWord® and a phase of SuccessMaker.
2 The analysis samples for the Fast ForWord® and SuccessMaker groups were not shown to be equivalent at baseline. Two other groups which combined Fast ForWord® and SuccessMaker are not appropriate counterfactuals because the measures of effects cannot be attributed solely to the Fast ForWord® program.

Appendix A1.4 Study characteristics: Borman & Benson, 2006

Characteristic Description
Study citation Borman, G. D., & Benson, J. (2006). Can brain research and computers improve literacy? A randomized field trial of the Fast ForWord® Language computer-based training program (WCER working paper no. 2006-5). Madison, WI: University of Wisconsin–Madison, School of Education.
Participants1 Students were eligible for the study if they scored below national norms on the total reading outcome for the district-administered Comprehensive Test of Basic Skills–Fifth Edition (CTBS/5) during the spring of 2000. These students also tended to have below-average outcomes on language skills. A total of 274 of these academically at-risk seventh-grade students took pretests (CTBS/5) in the spring of 2001. Random assignment was conducted separately within each of seven schools. Of the initial intervention and comparison students, listwise deletion of students with missing pretest or posttest data was conducted. Additionally, 13 students (eight from the treatment group and five from the control group) were dropped from the sample because they were determined to be outliers based on a substantial drop from pre- to posttest. In all, 90 students in the Fast ForWord® group and 98 students in the control group were included in the analysis sample (therefore, overall attrition was 31%). Although differential attrition between the treatment and control groups was 8%, the treatment and control groups were shown to be similar to each other at baseline. The groups primarily consisted of African-American (66.3% of both the intervention and comparison groups) and economically disadvantaged students (73.3% of the intervention group and 84.7% of the comparison group received free lunch).
Setting The study took place in seven middle schools in the Baltimore City Public School System.
Intervention In addition to their regular reading instruction, students randomly assigned to the intervention condition used the Fast ForWord® Language software program in school resource rooms. The resource rooms served as a targeted pullout program offered during the regular school day to supplement the regular classroom literacy instruction. Students received the program 100 minutes a day, five days a week, for at least 20 days under the supervision of a Fast ForWord®-trained teacher. The study reported students’ outcomes after two months of program implementation.
Comparison In addition to their regular reading instruction, comparison group students received nonliteracy instruction or participated in special activities and classes not related to literacy, such as art and gym.
Primary outcomes and measurement The eligible outcomes are standardized (normal curve equivalent) CTBS/5 Terra Nova Language and Reading test scores. These tests were administered both before and after the intervention. For a more detailed description of test outcome measures, see Appendix A2.4.
Staff/teacher training Before the start of the program, Scientific Learning provided training sessions for teachers operating the Fast ForWord® programs at the schools. No detailed information about these training sessions was provided by the authors.
1 In addition to the 188 students included in the analysis sample, the study also included 112 second-grade students who were excluded from the findings in this report because they did not fall in the grade range specified in the protocol.
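The overall attrition figure reported for this study can be checked from the counts given in the Participants row. The sketch below assumes that the 274 pretested students are the baseline for the calculation, which the text implies but does not state outright; the function and variable names are illustrative. Differential attrition cannot be recomputed the same way, because the number of students originally assigned to each group is not listed here.

```python
def overall_attrition(n_baseline, n_analysis):
    """Share of the baseline sample missing from the analysis sample."""
    if n_baseline <= 0 or not 0 <= n_analysis <= n_baseline:
        raise ValueError("counts must satisfy 0 <= n_analysis <= n_baseline, n_baseline > 0")
    return (n_baseline - n_analysis) / n_baseline


# Counts taken from the Participants row above.
n_pretested = 274          # assumed baseline: students who took the pretest
n_analysis = 90 + 98       # Fast ForWord group + control group = 188

rate = overall_attrition(n_pretested, n_analysis)
print(f"Overall attrition: {rate:.1%}")  # 31.4%, reported as 31%
```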

Appendix A1.5 Study characteristics: Overbay & Baenen, 2002

Characteristic Description
Study citation Overbay, A., & Baenen, N. (2002). Fast ForWord® evaluation, 2002–03 (Evaluation and Research, report no. 03.24). Raleigh, NC: Wake County Public School System.
Participants During the 2002–03 school year, 616 third- to eighth-grade students received the Fast ForWord® program. Of these, 426 were matched with students from non-Fast ForWord® schools based on race, limited English proficiency status, a special programs code, free and reduced-price lunch status, and reading pretest scores. The remaining 190 were missing either pre- or posttest scores and, therefore, were not included in the matching process. The analysis sample for this review included students in grades 4–8: 355 students in the Fast ForWord® group and 355 in the comparison group.1 Fast ForWord® was used in 10 elementary, middle, and high schools, and the comparison students were selected from schools that did not use Fast ForWord®. Additional findings reflecting students’ outcomes by grade can be found in Appendix A4.
Setting The study took place in one school district (10 treatment schools) in Raleigh, North Carolina.2
Intervention During the school year, the intervention group used Fast ForWord® Language, Fast ForWord® Language to Reading, and Fast ForWord® Reading. Most of the Fast ForWord® participants (91.4%) used Fast ForWord® Language; the majority (60%) used more than one level of the program. The 8.6% who did not use Fast ForWord® Language had completed it in 2001–02.
Comparison The counterfactual in this study is regular classroom instruction. However, the study authors note that students in the comparison group may have been exposed to a variety of other programs or interventions that were not controlled for in this study.
Primary outcomes and measurement For both pre- and posttests, the authors used the End of Grade Reading Subtest. For a more detailed description of this outcome measure, see Appendix A2.3.
Staff/teacher training No information about teacher or staff training was provided.
1 The study also presented data for students in grade 3, attending a total of six elementary schools, but these students do not fall within the age range of the WWC’s Adolescent Literacy reviews, so they are not included in this report.
2 The number of control schools is not available.

Appendix A1.6 Study characteristics: Scientific Learning Corporation, 2004a

Characteristic Description
Study citation Scientific Learning Corporation. (2004a). Improved Ohio reading proficiency test scores by students in the Springfield City School District who used Fast ForWord® products. MAPS for Learning: Educator Reports, 8(8), 1–6.
Participants Fourth-grade students who did not pass the fall 2002 Ohio Proficiency Test from four Title I designated schools were eligible to participate in the study. Each elementary school established its own method of identifying treatment and comparison group students for the study. The comparison group was formed by selecting 50 students with test scores from both fall 2002 and spring 2003 who had no exposure to Fast ForWord® products.1 The intervention and comparison groups were shown to be equivalent on the Ohio Reading Proficiency Test pretest scores. In all, 41 students who used the Fast ForWord® products and 50 students in the comparison group were included in the analysis sample.
Setting The study took place in four elementary schools in the Springfield City School District in Ohio.
Intervention The study used Fast ForWord® Language, Fast ForWord® Language to Reading, and Fast ForWord® to Reading 3 products. The Fast ForWord® Language protocol called for students to use the product for 100 minutes a day, five days a week, for four to eight weeks. The Fast ForWord® Language to Reading and Fast ForWord® to Reading 3 protocols called for use of the product for 90 minutes a day, five days a week, for four to eight weeks. Students included in the treatment group were required to have used Fast ForWord® products for 20 or more days. Schools used different implementation models, with some schools having students use the products in the back of the classroom and other schools sending students to computer labs that served between 7 and 24 students. The study reported students’ outcomes after one semester of program implementation.
Comparison The study did not describe the comparison condition. Presumably, the comparison group received the regular school curriculum.
Primary outcomes and measurement The Ohio Reading Proficiency Test (a statewide assessment) was administered in the year of the study, before and after the intervention. For a more detailed description of this outcome measure, see Appendix A2.3.
Staff/teacher training At each participating school, educators were trained in current and established neuroscience findings on how phonemic awareness and the acoustic properties of speech affect development of language and reading skills, information on the efficacy of the products, methods for assessment of potential candidates for participation, the selection of appropriate measures for testing and evaluation, effective implementation techniques, approaches for using Progress Tracker reports to monitor student performance, and techniques for measuring the gains students have achieved after they have finished using Fast ForWord® products.
1 The study authors did not provide detailed information on how comparison group students were selected (stating that comparison group students were “pseudo-randomly” selected).

Appendix A1.7 Study characteristics: Scientific Learning Corporation, 2004b

Characteristic Description
Study citation Scientific Learning Corporation. (2004b). Improved reading achievement by students in the school district of Philadelphia who used Fast ForWord® products. MAPS for Learning: Educator Reports, 8(21), 1–6.
Participants Three groups of students in grades 2 to 8 (mainly fourth and fifth graders) participated in Fast ForWord® supplemental instruction during the 2003–04 school year. Groups 1 and 2 comprised the treatment group for this study. Group 1 used Fast ForWord® between September and November, and group 2 used Fast ForWord® between December and February. Group 3 served as the comparison group (and used Fast ForWord® between March and May). The participating schools determined which students were placed in the three groups. Students were assessed in September and March. In all, 256 students in the Fast ForWord® treatment group and 37 students in the comparison group were included in the analysis sample. Additional findings reflecting students’ outcomes by grade and intervention group (1 versus 2) can be found in Appendix A4.
Setting The study took place in 16 schools in the Philadelphia School District in Pennsylvania.
Intervention Students participating in the Fast ForWord® group used a variety of Fast ForWord® products. All students used either the Fast ForWord® Language or Fast ForWord® Middle and High School product for an average of 25 days. In addition, about half of the students used Fast ForWord® Language to Reading products (which are part of the Fast ForWord® Language series), and one-tenth of the students used Fast ForWord® Reading 3 products (which are part of the Fast ForWord® Reading series). Fast ForWord® was used as a supplement to the regular reading curriculum. The study reported students’ outcomes after three months of program implementation.
Comparison Before March 2004, comparison group students received their regular reading curriculum.
Primary outcomes and measurement The eligible outcome in this study is the Gates–MacGinitie Reading Test, which was administered both before and after the intervention. For a more detailed description of this outcome measure, see Appendix A2.3.
Staff/teacher training Teachers were trained in current and established findings on the neuroscience of how phonemic awareness and acoustic properties of speech impact development of language and reading skills; information on the efficacy of the products; methods for assessment of potential product participants; the selection of appropriate standardized language measures for testing and evaluation; effective implementation techniques; instruction on the product, Progress Tracker, and the reports generated by the product that allow educators and coaches to monitor student performance; and techniques for measuring the progress and gains students achieve after they have finished using the product.

Appendix A1.8 Study characteristics: Scientific Learning Corporation, 2007b

Characteristic Description
Study citation Scientific Learning Corporation. (2007b). Improved reading skills by students in the South Madison Community School Corporation who used Fast ForWord® products. MAPS for Learning: Educator Reports, 11(34), 1–7.
Participants Two schools that used Fast ForWord® during the spring of 2007 selected students in grades 2 to 5 for the study based on their scores on the Measures of Academic Progress (MAP) assessment. To form a comparison group, school personnel individually matched—by grade level and fall and winter scores from the MAP Reading subtest—80 students in the Fast ForWord® group to 80 students not using Fast ForWord®. The study sample included 78 treatment and 78 comparison students. The analysis sample for this review included students in grades 4 and 5: 35 students in the Fast ForWord® group and 35 students in the comparison group.
Setting This study took place in East Elementary and Maple Ridge Elementary in the South Madison Community School Corporation of Pendleton, Indiana.
Intervention The intervention groups used Fast ForWord® Language and Fast ForWord® Language to Reading products. The South Madison Community School Corporation chose to use the 50-minute Fast ForWord® protocols, which called for students to use the product for 50 minutes a day, five days per week, for 6 to 10 weeks. The study reported students’ outcomes after three months of program implementation.
Comparison The comparison group received the standard district reading curriculum.
Primary outcomes and measurement The outcomes in this study are students’ reading and language scores on the MAP assessment, which was administered both before and after the intervention was used for the study. For a more detailed description of this outcome measure, see Appendices A2.3–A2.4.
Staff/teacher training Educators were trained in current and established neuroscience findings on how phonemic awareness and the acoustic properties of speech impact development of language and reading skills, information on the efficacy of the products, methods for assessing potential candidates for participation, the selection of appropriate measures for testing and evaluation, effective implementation techniques, approaches for using Progress Tracker reports to monitor student performance, and techniques for measuring the gains students have achieved after they have finished using Fast ForWord® products.

Appendix A2.1 Outcome measures for the alphabetics domain

Outcome measure Description
Phonemic awareness
Woodcock-Johnson Psycho-Educational Battery–Revised, Tests of Cognitive Ability (WJ-R COG) (Auditory Processing Cluster for Phonemic Awareness) This composite is a standardized measure of a student’s ability to identify patterns among speech-based auditory stimuli. The score on this composite is derived from scores on three subtests: (1) the Sound Blending subtest measures the ability to synthesize sequences of sounds into whole words, (2) the Incomplete Words subtest measures the ability to identify a word with missing sounds, and (3) the Sound Patterns subtest measures the ability to indicate whether pairs of computer-generated sound sequences are the same or different (as cited in Beattie, 2000).
Phonics
Woodcock-Johnson Psycho-Educational Battery–Revised, Tests of Achievement (WJ-R ACH) (Letter-Word Identification) This standardized subtest, which assesses students’ ability to identify words and letters, requires students to read aloud isolated letters and real words that range in frequency and difficulty (as cited in Beattie, 2000).
Woodcock-Johnson Psycho-Educational Battery–Revised, Tests of Achievement (WJ-R ACH) (Word Attack) This standardized subtest measures phonemic decoding skills by asking students to read “pseudo” words (e.g., plurp, fronkett). Students are aware that the words are not real (as cited in Beattie, 2000).
Wide Range Achievement Test–Third Edition (WRAT-3) (Spelling subtest) This standardized subtest is a paper-and-pencil assessment that measures students’ ability to write their names, as well as letters and words from dictation. The dictated letters and words followed either phonetically regular or irregular patterns (as cited in Beattie, 2000).
Queensland University Inventory of Literacy (QUIL) The QUIL is a standardized clinical assessment tool for measuring the phonological awareness skills of school-age children as they pertain to literacy. Three of the 10 subtests were administered to all students: Nonword Spelling, Phoneme Segmentation, and Phoneme Manipulation. In addition, students in years 4–7 were administered the Spoonerisms subtest, which assesses students’ metalinguistic phoneme awareness (as cited in Scientific Learning Corporation, 2007a).

Appendix A2.2 Outcome measures for the reading fluency domain

Outcome measure Description
Gray Oral Reading Test–Third Edition (GORT-3) In this standardized test, students are required to read orally a variety of graded passages to measure reading rate, word identification, and comprehension skills. The Passage subtest assesses a combination of rate and accuracy. The Comprehension subtest requires a student to respond to five multiple choice questions following each story. The Oral Reading Quotient reflects a total measure of a student’s oral reading performance and is calculated by combining the Passage and Comprehension scores (as cited in Beattie, 2000).

Appendix A2.3 Outcome measures for the comprehension domain

Outcome measure Description
Reading comprehension and vocabulary development
Woodcock-Johnson Psycho-Educational Battery–Revised, Tests of Achievement (WJ-R ACH) (Passage Comprehension) In this standardized test, comprehension is measured by having students fill in missing words in a short paragraph (e.g., “Woof,” said the __________, biting the hand that fed it.) (as cited in Beattie, 2000).
Ohio Proficiency Test (OPT), Reading subtest This statewide assessment is administered to students in 4th, 6th, and 9th grade. The Reading subtest includes multiple choice, short answer, and extended response questions across four subscales: constructing meaning from fiction, examining/extending meaning in fiction, constructing meaning from nonfiction, and examining/extending meaning in nonfiction. The subtest contains two or three fiction or poetry selections and two or three nonfiction selections, which may include pamphlets, instruction booklets, and newspaper and magazine articles. The selections total about 1,200 to 1,500 words. Students may be asked to summarize or retell a story, to interpret vocabulary, or to infer information. Students may also be asked to make predictions, to distinguish facts from opinions, or to fill in a chart or diagram with information from the selection. Word usage, grammar, spelling, and mechanics do not affect scoring, unless the student’s ideas are not clear to the evaluator (as cited in Scientific Learning Corporation, 2004a).
North Carolina End of Grade Test The North Carolina End of Grade test measures students’ achievement of the goals and objectives specified in the 2004 North Carolina English Language Arts Standard Course of Study (Content Standards). Reading comprehension is assessed by having students read authentic selections and then answer questions directly related to the selections. Knowledge of vocabulary is assessed indirectly through application and understanding of terms within the context of selections and questions. The authentic selections in the reading tests are chosen to reflect reading for various purposes such as literary experience, gaining information, and performing a task (as cited in Overbay & Baenen, 2002).
Gates–MacGinitie Reading Test (GMRT) The GMRT is used to assess a student’s decoding, vocabulary, and passage comprehension skills.1 The test has two components that independently assess reading vocabulary and comprehension skills. The Vocabulary subtest measures each student’s reading vocabulary by asking the student to choose one word or phrase that means most nearly the same as a presented word. The subtest contains 45 questions. The Comprehension subtest measures each student’s ability to read and understand different types of prose. The subtest contains 11 passages of various lengths and subjects, and 48 questions (as cited in Scientific Learning Corporation, 2004b).
Measures of Academic Progress (MAP), Reading test Developed by the Northwest Evaluation Association (NWEA), the MAP assessments are state-aligned computerized adaptive tests that reflect the instructional level of each student and measure growth over time. The MAP is appropriate for students in grades 2 through 10. The untimed assessment typically features between 40 and 50 items. The assessment is usually tailored to the specific needs of individual organizations, but all NWEA MAP assessments draw from the same item bank. The Reading test draws items from the following areas: word meaning (such as use of context clues; use of synonyms, antonyms, and homonyms; use of component structure; or interpretation of multiple meanings), literal comprehension (such as recalling details, interpreting directions, sequencing details, classifying facts, or identifying main ideas), interpretive comprehension (such as drawing inferences, recognizing cause and effect, predicting events, or summarizing and synthesizing), and evaluative comprehension (such as distinguishing fact and opinion, recognizing elements of persuasion, evaluating validity and point of view, evaluating conclusions, or detecting bias and assumptions) (as cited in Scientific Learning Corporation, 2007b).
Comprehensive Test of Basic Skills (CTBS/5) Terra Nova Reading Composite This assessment combines selected-response items with constructed-response items that allow students to produce short and extended responses. The Reading composite score is the average of Reading Comprehension and Vocabulary subtest scores (as cited in TerraNova Prepublication Technical Manual, July 1996). The Reading Comprehension subtest items focus on five objectives: (1) oral comprehension of passages read aloud, (2) basic understanding of literal meanings of passages, (3) analyzing text, (4) evaluating and extending meaning, and (5) identifying reading strategies. The Vocabulary subtest focuses on three objectives: (1) understanding word meaning, (2) identifying multi-meaning words, and (3) inferring words in context (as cited in Borman & Benson, 2006).
1 At levels D (4th grade) and up, either subtest or the combination of both subtests falls into the comprehension domain. At levels A, B, and C (grades 1, 2, and 3), the vocabulary measure, which taps decoding skills rather than word meanings, would fall in the alphabetics domain. For the Scientific Learning Corporation (2004b) study, which included students from grades 2–8, the WWC classified the Gates–MacGinitie Reading Test as a comprehension measure, as the majority of study participants came from grades 4 and 5 (levels D and up).

Appendix A2.4 Outcome measures for the general literacy achievement domain

Outcome measure Description
General reading achievement
Comprehensive Test of Basic Skills (CTBS/5) Terra Nova Language Composite This assessment combines selected-response items with constructed-response items that allow students to produce short and extended responses. The Language Composite score is the average of scores on the Language and Language Mechanics subtests (as cited in TerraNova Prepublication Technical Manual, July 1996). The Language subtest covers four objectives: (1) introduction to print, (2) understanding sentence structure, (3) writing strategies, and (4) editing skills. The Language Mechanics subtest focuses on three objectives: (1) appropriate construction of sentences, phrases, and clauses; (2) appropriate writing conventions; and (3) editing skills (as cited in Borman & Benson, 2006).
Success for All assessment The Success for All assessments (which are administered every 6–8 weeks) are a set of reading assessments closely aligned with the Success for All curriculum. The total score of the assessments reflects (1) students’ scores on a paper-and-pencil assessment and (2) a more subjective assessment (by the evaluator) of the student’s class work during the time period. For example, the subjective assessment might evaluate how well children understand the learning objective, how their writing has progressed, and how well they comprehend what is read to them. Therefore, the total score not only reflects students’ reading and writing achievement, but it can also reflect educational behaviors and habits (e.g., note taking, direction following, attention and focus). The version of the paper-and-pencil assessment administered to students depends on students’ ability level and language proficiency. The assessments are designed to closely match the individual state’s assessment in both content and format (as cited in Rouse & Krueger, 2004).
State Standardized Reading Test This is the state’s criterion-referenced standardized test (the study authors did not specify which state). The exam is designed to be aligned with the curriculum standards of the state as well as to parallel critical aspects of the National Assessment of Educational Progress (NAEP). The state administers tests in reading, math, and writing annually (as cited in Rouse & Krueger, 2004).
Clinical Evaluation of Language Fundamentals– Third Edition (CELF-3), Receptive Language This standardized assessment measures a student’s ability to interpret and execute commands of increasing complexity and to understand relationships between words and categories. It addresses sentence structure, concepts and directions, and word classes (as cited in Beattie, 2000). The Receptive Language portion of the assessment includes five components: (1) sentence structure, in which students point to one of four pictures in response to an orally presented stimulus; (2) concepts and directions, in which students identify pictures of geometric shapes in response to orally presented direction; (3) semantic relations, in which students listen to four facts and then select two of four visually presented options; (4) word classes, in which students select two out of three or four orally presented words that go together; and (5) recalling sentences, in which students imitate an orally presented sentence (as cited in Rouse & Krueger, 2004).
CELF-4—Australian Standard Edition, Receptive Language CELF-4 is a standardized test widely used to measure a student’s overall oral language ability. The Receptive Language index is a cumulative measure of students’ performance on subtests designed to best probe receptive aspects of language including comprehension and listening. The subtests cover topics such as Concepts & Following Directions, Word Classes, and Sentence Structure (as cited in Scientific Learning Corporation, 2007a).
CELF-4—Australian Standard Edition, Expressive Language CELF-4 is a standardized test widely used to measure a student’s overall oral language ability. The Expressive Language index is a cumulative measure of students’ performance on subtests that probe expressive aspects of language including oral language expression. The subtests cover topics such as Word Structure, Recalling Sentences, and Formulated Sentences (as cited in Scientific Learning Corporation, 2007a).
Measures of Academic Progress (MAP), Language Test Developed by the Northwest Evaluation Association (NWEA), the MAP are state-aligned computerized adaptive tests that reflect the instructional level of each student and measure growth over time. The MAP is appropriate for students in grades 2 through 10. The untimed assessment typically features between 40 and 50 items. The assessment is usually tailored to the specific needs of individual organizations, but all NWEA MAP assessments draw from the same item bank. The Language Test draws items from the following areas: writing process, composition structure, grammar/usage, punctuation, and capitalization (as cited in Scientific Learning Corporation, 2007b).

Appendix A3.1 Summary of study findings for all domains1

  Domain
  Alphabetics Reading fluency Comprehension General literacy achievement
Meets WWC evidence standards
Rouse & Krueger (2004) nr nr nr ind
Scientific Learning Corporation (2007a) ind nr nr ind
Meets WWC evidence standards with reservations
Beattie (2000) ind (+) (+) ind
Borman & Benson (2006) nr nr ind ind
Overbay & Baenen (2002) nr nr ind nr
Scientific Learning Corporation (2004a) nr nr (+) nr
Scientific Learning Corporation (2004b) nr nr + nr
Scientific Learning Corporation (2007b) nr nr ind ind
Rating of effectiveness No discernible effects Potentially positive effects Potentially positive effects No discernible effects

nr = no reported outcomes under this domain
+ = study finding was positive and statistically significant
(+) = study finding was positive and substantively important, but not statistically significant
ind = study finding was indeterminate; that is, neither substantively important nor statistically significant
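
As an illustration of how the symbols in this legend relate to the effect sizes and significance tests reported in Appendices A3.2 through A3.5, the following is a minimal sketch. It assumes that "substantively important" means an effect size of at least 0.25 in absolute value; that threshold is not stated in this appendix and should be checked against the WWC Procedures and Standards Handbook. The function name and the label used for negative findings are hypothetical.

```python
# Assumed cutoff for a "substantively important" effect size.
# This value is NOT stated in the appendix; it is an assumption of this sketch.
SUBSTANTIVE_THRESHOLD = 0.25


def classify_finding(effect_size: float, significant: bool) -> str:
    """Map one study-level finding in a domain to a legend symbol."""
    if significant:
        # Statistically significant findings are classified by their sign.
        return "+" if effect_size > 0 else "negative (not in this legend)"
    if effect_size >= SUBSTANTIVE_THRESHOLD:
        return "(+)"  # substantively important, not statistically significant
    if effect_size <= -SUBSTANTIVE_THRESHOLD:
        return "negative (not in this legend)"
    return "ind"  # neither substantively important nor significant


# Study averages taken from Appendices A3.2 through A3.4.
print(classify_finding(0.39, True))    # Scientific Learning Corporation (2004b), comprehension
print(classify_finding(0.44, False))   # Beattie (2000), reading fluency
print(classify_finding(0.23, False))   # Scientific Learning Corporation (2007a), alphabetics
```

Under that assumed threshold, the three calls return "+", "(+)", and "ind", which agree with the corresponding cells in the table above.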

1 This appendix reports findings considered for the effectiveness rating and the average improvement indices in each domain. More detailed information on findings for all measures within the domains and the constructs that factor into the domains can be found in Appendices A3.2–A3.5.

Appendix A3.2 Summary of study findings included in the rating for the alphabetics domain1

Authors' findings from the study (Fast ForWord® group and Comparison group columns): Mean outcome2 (standard deviation)3. WWC calculations (remaining columns): mean difference, effect size, statistical significance, improvement index.
Outcome measure Study sample4 Sample size (students) Fast ForWord® group Comparison group Mean difference5 (Fast ForWord® – comparison) Effect size6 Statistical significance7 (at α = 0.05) Improvement index8
Scientific Learning Corporation, 2007a8,9
Queensland University Inventory of Literacy (QUIL) Ages 5–14 137 8.49 (2.31) 7.93 (2.58) 0.56 0.23 ns +9
Average for alphabetics (Scientific Learning Corporation, 2007a)9 0.23 ns +9
Beattie, 200010
WJ-R ACH (Letter-Word Identification) Ages 12–17 24 90.99 (21.29) 92.08 (13.15) –1.09 –0.06 ns –2
WJ-R ACH (Word Attack) Ages 12–17 24 86.41 (14.34) 85.91 (12.87) 0.50 0.04 ns +1
WJ-R COG (Auditory Processing Cluster for Phonemic Awareness) Ages 12–17 24 82.58 (14.14) 85.66 (15.61) –3.08 –0.20 ns –8
Wide Range Achievement Test–Third Edition (WRAT-3) (Spelling subtest) Ages 12–17 24 82.58 (15.10) 85.66 (13.13) –3.08 –0.21 ns –8
Average for alphabetics (Beattie, 2000)11 –0.11 ns –4
Domain average for alphabetics across all studies9 0.06 na +2

ns = not statistically significant
na = not applicable
WJ-R = Woodcock-Johnson Psycho-Educational Battery–Revised

1 This appendix reports findings considered for the effectiveness rating and the average improvement indices for the alphabetics domain.
2 The intervention group values are the comparison group means plus the difference in mean gains between the intervention and comparison groups.
3 The standard deviation across all students in each group shows how dispersed the participants’ outcomes are: a smaller standard deviation on a given measure would indicate that participants had more similar outcomes.
4 The Adolescent Literacy topic area reviews studies of interventions administered to students in grades 4–12 (or 9–18 years of age). For studies that include samples of students that span both the Adolescent Literacy (grades 4–12) and Beginning Reading (grades K–3) topic areas and cannot be disaggregated by grade level, the Adolescent Literacy topic area reviews any studies that include 5th-grade students or higher. For example, this appendix includes a combined sample of students aged 5–14 years (Scientific Learning Corporation, 2007a).
5 Positive differences and effect sizes favor the intervention group; negative differences and effect sizes favor the comparison group.
6 For an explanation of the effect size calculation, see WWC Procedures and Standards Handbook, Appendix B.
7 Statistical significance is the probability that the difference between groups is a result of chance rather than a real difference between the groups.
8 The improvement index represents the difference between the percentile rank of the average student in the intervention condition and that of the average student in the comparison condition. The improvement index can take on values between –50 and +50, with positive numbers denoting favorable results for the intervention group.
9 Results for the early elementary school students (in 3rd grade or below) in this study are traditionally considered under the Beginning Reading topic area reviews; however, because there was no separate analysis for students in 3rd grade or below (grades covered by the Beginning Reading topic area) and 4th grade and above (areas covered by the Adolescent Literacy topic area), we report on the total sample of students here.
10 The level of statistical significance was reported by the study authors or, when necessary, calculated by the WWC to correct for clustering within classrooms or schools and for multiple comparisons. For an explanation about the clustering correction, see the WWC Tutorial on Mismatch. For the formulas the WWC used to calculate the statistical significance, see WWC Procedures and Standards Handbook, Appendix C for clustering and WWC Procedures and Standards Handbook, Appendix D for multiple comparisons. In the case of Beattie (2000), a correction for multiple comparisons was needed, so the significance level may differ from that reported in the original study. In the case of Scientific Learning Corporation (2007a), no corrections for clustering or multiple comparisons were needed.
11 The WWC-computed average effect sizes for each study and for the domain across studies are simple averages rounded to two decimal places. The average improvement indices are calculated from the average effect sizes.
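
The improvement index and the simple averages described in notes 8 and 11 can be illustrated with a short sketch. It assumes the usual normal-curve conversion, in which the improvement index is the percentile rank implied by the effect size minus 50; the WWC Procedures and Standards Handbook remains the authoritative source for the calculation. The function names are hypothetical, and the effect sizes are the rounded values printed in the table above.

```python
from math import erf, sqrt


def normal_cdf(z: float) -> float:
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def improvement_index(effect_size: float) -> float:
    """Percentile-point gain of the average intervention student (assumed normal outcomes)."""
    return (normal_cdf(effect_size) - 0.5) * 100.0


# Rounded effect sizes for the four Beattie (2000) alphabetics measures.
beattie = [-0.06, 0.04, -0.20, -0.21]
beattie_avg = sum(beattie) / len(beattie)   # simple average, about -0.1075

# Domain average across the two studies (2007a study average is 0.23).
domain_avg = (0.23 + beattie_avg) / 2

print(round(improvement_index(0.23)))         # Scientific Learning Corporation (2007a)
print(round(improvement_index(beattie_avg)))  # Beattie (2000) study average
print(round(improvement_index(domain_avg)))   # alphabetics domain average
```

Because the table reports effect sizes rounded to two decimal places, applying this conversion to a single printed value can differ from the printed improvement index by one point (the Word Attack row is an example); the study and domain averages above come out at +9, –4, and +2, as in the table.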

Appendix A3.3 Summary of study findings included in the rating for the reading fluency domain1

Author's findings from the study (Fast ForWord® group and Comparison group columns): Mean outcome2 (standard deviation)3. WWC calculations (remaining columns): mean difference, effect size, statistical significance, improvement index.
Outcome measure Study sample Sample size (students) Fast ForWord® group Comparison group Mean difference4 (Fast ForWord® – comparison) Effect size5 Statistical significance6 (at α = 0.05) Improvement index7
Beattie, 20008
Gray Oral Reading Test–Third Edition (GORT-3) Ages 12–17 24 87.39 (16.47) 79.50 (17.74) 7.89 0.44 ns +17
Average for reading fluency (Beattie, 2000)9 0.44 ns +17

ns = not statistically significant

1 This appendix reports findings considered for the effectiveness rating and the average improvement indices for the reading fluency domain.
2 The intervention group values are the comparison group means plus the difference in mean gains between the intervention and comparison groups.
3 The standard deviation across all students in each group shows how dispersed the participants’ outcomes are: a smaller standard deviation on a given measure would indicate that participants had more similar outcomes.
4 Positive differences and effect sizes favor the intervention group; negative differences and effect sizes favor the comparison group.
5 For an explanation of the effect size calculation, see WWC Procedures and Standards Handbook, Appendix B.
6 Statistical significance is the probability that the difference between groups is a result of chance rather than a real difference between the groups.
7 The improvement index represents the difference between the percentile rank of the average student in the intervention condition and that of the average student in the comparison condition. The improvement index can take on values between –50 and +50, with positive numbers denoting favorable results for the intervention group.
8 The level of statistical significance was reported by the study authors or, when necessary, calculated by the WWC to correct for clustering within classrooms or schools and for multiple comparisons. For an explanation about the clustering correction, see the WWC Tutorial on Mismatch. For the formulas the WWC used to calculate the statistical significance, see WWC Procedures and Standards Handbook, Appendix C for clustering and WWC Procedures and Standards Handbook, Appendix D for multiple comparisons. In the case of Beattie (2000), no corrections for clustering or multiple comparisons were needed.
9 This row provides the study average, which in this instance is also the domain average. The domain improvement index is calculated from the average effect size.
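
Note 5 refers to the handbook for the effect size calculation. As a rough sketch only, the GORT-3 effect size can be approximated with a standardized mean difference (Hedges' g) using a pooled standard deviation and a small-sample correction. The group sizes are not given in this appendix, so the 12/12 split of the 24 students below is purely an assumption, and the function name is hypothetical.

```python
from math import sqrt


def hedges_g(mean_t, sd_t, n_t, mean_c, sd_c, n_c):
    """Standardized mean difference with pooled SD and small-sample correction."""
    pooled_sd = sqrt(((n_t - 1) * sd_t ** 2 + (n_c - 1) * sd_c ** 2)
                     / (n_t + n_c - 2))
    correction = 1.0 - 3.0 / (4.0 * (n_t + n_c) - 9.0)
    return correction * (mean_t - mean_c) / pooled_sd


# GORT-3 means and standard deviations from the table above.
# ASSUMPTION: 12 students per group; the actual split is not reported here.
g = hedges_g(87.39, 16.47, 12, 79.50, 17.74, 12)
print(f"{g:.3f}")
```

With the assumed equal split, the result is close to the reported 0.44 but not identical; the exact value depends on the actual group sizes and on unrounded inputs, neither of which appears in this appendix.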

Appendix A3.4 Summary of study findings included in the rating for the comprehension domain1

Authors' findings from the study (Fast ForWord® group and Comparison group columns): Mean outcome2 (standard deviation)3. WWC calculations (remaining columns): mean difference, effect size, statistical significance, improvement index.
Outcome measure Study sample4 Sample size (students) Fast ForWord® group Comparison group Mean difference5 (Fast ForWord® – comparison) Effect size6 Statistical significance7 (at α = 0.05) Improvement index8
Beattie, 20009
WJ-R ACH Passage Comprehension Ages 12–17 24 97.17 (13.78) 93.25 (11.30) 3.92 0.30 ns +12
Average for comprehension (Beattie, 2000)10 0.30 ns +12
Borman & Benson, 20068
CTBS/5 Terra Nova Reading NCE Scores Grade 7 188 36.99 (14.11) 34.03 (14.92) 2.96 0.20 ns +8
Average for comprehension (Borman & Benson, 2006)9 0.20 ns +8
Overbay & Baenen, 20028,11
North Carolina End of Grade Grades 4–8 710 154.37 155.37 –1.00 –0.14 ns –6
Average for comprehension (Overbay & Baenen, 2002)9 –0.14 ns –6
Scientific Learning Corporation, 2004a8
Ohio Proficiency Test, Reading Test Grade 4 91 210.60 (16.65) 205.10 (16.97) 5.50 0.32 ns +13
Average for comprehension (Scientific Learning Corporation, 2004a)9 0.32 ns +13
Scientific Learning Corporation, 2004b8,12
Gates–MacGinitie Reading Test Grades 2–8 293 30.39 (14.37) 25.00 (10.60) 5.39 0.39 Statistically significant +15
Average for comprehension (Scientific Learning Corporation, 2004b)9 0.39 Statistically significant +15
Scientific Learning Corporation, 2007b8,13
Measures of Academic Progress, Reading Test Grades 4–5 70 34.90 (32.00) 30.50 (30.10) 4.40 0.14 ns +6
Average for comprehension (Scientific Learning Corporation, 2007b)9 0.14 ns +6
Domain average for comprehension across all studies9 0.20 na +8

ns = not statistically significant
na = not applicable
WJ-R ACH = Woodcock-Johnson Psycho-Educational Battery–Revised, Tests of Achievement
CTBS/5 = Comprehensive Test of Basic Skills
NCE = Normal Curve Equivalent

1 This appendix reports findings considered for the effectiveness rating and the average improvement indices for the comprehension domain. Subgroup findings from the same studies are not included in these ratings, but are reported in Appendix A4.
2 The intervention and control group values for Scientific Learning Corporation (2007b) are the ANCOVA adjusted mean values calculated using pretest scores as the covariates. For all other studies in this domain, the intervention group values are the comparison group means plus the difference in mean gains between the intervention and comparison groups.
3 The standard deviation across all students in each group shows how dispersed the participants’ outcomes are: a smaller standard deviation on a given measure would indicate that participants had more similar outcomes.
4 The Adolescent Literacy topic area reviews studies of interventions administered to students in grades 4–12 (or 9–18 years of age). For studies that include samples of students that span both the Adolescent Literacy (grades 4–12) and Beginning Reading (grades K–3) topic areas and cannot be disaggregated by grade level, the Adolescent Literacy topic area reviews any studies that include 5th-grade students or higher. For example, this appendix includes a combined sample of students from grades 2–8 (Scientific Learning Corporation, 2004b).
5 Positive differences and effect sizes favor the intervention group; negative differences and effect sizes favor the comparison group.
6 For an explanation of the effect size calculation, see WWC Procedures and Standards Handbook, Appendix B.
7 Statistical significance is the probability that the difference between groups is a result of chance rather than a real difference between the groups.
8 The improvement index represents the difference between the percentile rank of the average student in the intervention condition and that of the average student in the comparison condition. The improvement index can take on values between –50 and +50, with positive numbers denoting favorable results for the intervention group.
9 The level of statistical significance was reported by the study authors or, when necessary, calculated by the WWC to correct for clustering within classrooms or schools (corrections for multiple comparisons were not done for findings not included in the overall intervention rating). For an explanation about the clustering correction, see the WWC Tutorial on Mismatch. For the formulas the WWC used to calculate the statistical significance, see WWC Procedures and Standards Handbook, Appendix C. For the Fast ForWord® studies summarized here, no corrections for clustering or multiple comparisons were needed.
10 This row provides the study average, which in this instance is also the domain average. The WWC-computed domain average effect size is a simple average rounded to two decimal places. The domain improvement index is calculated from the average effect size.
11 This study reported the mean values for the outcome measure, not the standard deviations. The effect size for each grade was calculated from the F-statistics of the one-way ANOVA reported in the study. The average effect size reported here is based on effect sizes that have been weighted by the sample size for each grade.
12 The means and standard deviations were aggregated across two intervention groups.
13 This study separately reported results for students in grades 3 and below and for students in grades 4 and above, along with aggregated results across all of the grade levels. Results for the second- and third-grade students in this study will be considered under the Beginning Reading topic area reviews.
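
The sample-size weighting described in note 11 can be checked against the grade-level figures for Overbay & Baenen (2002) listed in Appendix A4. The sketch below is a plain weighted mean under that reading of the note; the variable names are hypothetical, and the inputs are the rounded effect sizes as printed.

```python
# (grade, sample size, effect size) for Overbay & Baenen (2002), from Appendix A4.
grades = [
    (4, 114, -0.35),
    (5, 148, 0.06),
    (6, 78, -0.23),
    (7, 224, -0.18),
    (8, 146, -0.08),
]

total_n = sum(n for _, n, _ in grades)

# Each grade's effect size counts in proportion to its sample size.
weighted_effect = sum(n * es for _, n, es in grades) / total_n

print(total_n)                     # combined sample across grades 4-8
print(round(weighted_effect, 2))   # weighted average effect size
```

The combined sample is 710 students and the weighted average rounds to –0.14, matching the Overbay & Baenen row in the table above.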

Appendix A3.5 Summary of study findings included in the rating for the general literacy achievement domain1

Authors' findings from the study (Fast ForWord® group and Comparison group columns): Mean outcome2 (standard deviation)3. WWC calculations (remaining columns): mean difference, effect size, statistical significance, improvement index.
Outcome measure Study sample4 Sample size (students) Fast ForWord® group Comparison group Mean difference5 (Fast ForWord® – comparison) Effect size6 Statistical significance7 (at α = 0.05) Improvement index8
Rouse & Krueger, 20048
Success for All Assessment Grades 3–6 373 4.06 (1.40) 4.03 (1.33) 0.03 0.02 ns +1
CELF-3, Receptive Language Grades 3–6 86 31.70 (18.43) 31.01 (16.59) 0.69 0.04 ns +2
State Standardized Reading Test Grades 3–6 454 44.18 (24.79) 43.03 (24.01) 1.15 0.05 ns +2
Average for general literacy achievement (Rouse & Krueger, 2004)9 0.04 ns +1
Scientific Learning Corporation, 2007a8,9
CELF-4, Receptive Language Ages 5–14 137 91.00 (12.40) 88.40 (14.12) 2.60 0.19 ns +8
CELF-4, Expressive Language Ages 5–14 137 88.00 (12.40) 85.00 (13.29) 3.00 0.23 ns +9
Average for general literacy achievement (Scientific Learning Corporation, 2007a)9 0.21 ns +8
Beattie, 20008
CELF-3, Receptive Language Ages 12–17 24 86.08 (21.11) 86.83 (22.74) –0.75 –0.03 ns –1
Average for general literacy achievement (Beattie, 2000)9 –0.03 ns –1
Borman & Benson, 200610
CTBS/5 Terra Nova Language NCE Scores Grade 7 188 40.52 (11.22) 40.14 (11.59) 0.38 0.03 ns +1
Average for general literacy achievement (Borman & Benson, 2006)11 0.03 ns +1
Scientific Learning Corporation, 2007b8,12
Measures of Academic Progress, Language Test Grades 4–5 70 31.10 (27.90) 26.80 (26.60) 4.30 0.16 ns +6
Average for general literacy achievement (Scientific Learning Corporation, 2007b)9 0.16 ns +6
Domain average for general literacy achievement across all studies9 0.08 na +3

ns = not statistically significant
na = not applicable
CELF-3 = Clinical Evaluation of Language Fundamentals–Third Edition
CELF-4 = Clinical Evaluation of Language Fundamentals–Fourth Edition
CTBS/5 = Comprehensive Test of Basic Skills
NCE = Normal Curve Equivalent

1 This appendix reports findings considered for the effectiveness rating and the average improvement indices for the general literacy achievement domain.
2 The intervention group values are the comparison group means plus the difference in mean gains between the intervention and comparison groups tested immediately after the intervention for Beattie (2000) and Scientific Learning Corporation (2007a). The intervention group values are the comparison group means plus the regression-adjusted impacts for Borman and Benson (2006) and Rouse and Krueger (2004). The intervention and control group values for Scientific Learning Corporation (2007b) are the ANCOVA adjusted mean values calculated using pretest scores as the covariates.
3 The standard deviation across all students in each group shows how dispersed the participants’ outcomes are: a smaller standard deviation on a given measure would indicate that participants had more similar outcomes.
4 The Adolescent Literacy topic area reviews studies of interventions administered to students in grades 4–12 (or 9–18 years of age). For studies that include samples of students that span both the Adolescent Literacy (grades 4–12) and Beginning Reading (grades K–3) topic areas and cannot be disaggregated by grade level, the Adolescent Literacy topic area reviews any studies that include 5th-grade students or higher. For example, this appendix includes a combined sample of students from grades 3–6 (Rouse & Krueger, 2004) and students aged 5–14 years (Scientific Learning Corporation, 2007a).
5 Positive differences and effect sizes favor the intervention group; negative differences and effect sizes favor the comparison group.
6 For an explanation of the effect size calculation, see WWC Procedures and Standards Handbook, Appendix B.
7 Statistical significance is the probability that the difference between groups is a result of chance rather than a real difference between the groups.
8 The improvement index represents the difference between the percentile rank of the average student in the intervention condition and that of the average student in the comparison condition. The improvement index can take on values between –50 and +50, with positive numbers denoting favorable results for the intervention group.
9 Results for the early elementary school students (in grades 3 and below) in this study are traditionally considered under the Beginning Reading topic area reviews; however, because there was no separate analysis for students in 3rd grade or below (grades covered by the Beginning Reading topic area) and 4th grade and above (areas covered by the Adolescent Literacy topic area), we report on the total sample of students here.
10 The level of statistical significance was reported by the study authors or, when necessary, calculated by the WWC to correct for clustering within classrooms or schools and for multiple comparisons. For an explanation about the clustering correction, see the WWC Tutorial on Mismatch. For the formulas the WWC used to calculate the statistical significance, see WWC Procedures and Standards Handbook, Appendix C for clustering and WWC Procedures and Standards Handbook, Appendix D for multiple comparisons. For all Fast ForWord® studies summarized here, except Beattie (2000) and Borman & Benson (2006), a correction for multiple comparisons was needed, so the significance levels may differ from those reported in the original study.
11 The WWC-computed average effect sizes for each study and for the domain across studies are simple averages rounded to two decimal places. The average improvement indices are calculated from the average effect sizes.
12 This study separately reported results for students in grades 3 and below and for students in grades 4 and above, along with aggregated results across all of the grade levels. Results for the 2nd- and 3rd-grade students in this study will be considered under the Beginning Reading topic area reviews.

Appendix A4 Summary of subgroup findings for the comprehension domain1

Authors' findings from the study (Fast ForWord® group and Comparison group columns): Mean outcome2 (standard deviation)3. WWC calculations (remaining columns): mean difference, effect size, statistical significance, improvement index.
Outcome measure Study sample4 Sample size (students) Fast ForWord® group Comparison group Mean difference5 (Fast ForWord® – comparison) Effect size6 Statistical significance7 (at α = 0.05) Improvement index8
Overbay & Baenen, 20029
North Carolina End of Grade Test Grade 4 114 148.39 150.90 –2.51 –0.35 ns –14
North Carolina End of Grade Test Grade 5 148 156.07 155.76 0.31 0.06 ns +2
North Carolina End of Grade Test Grade 6 78 149.80 151.59 –1.79 –0.23 ns –9
North Carolina End of Grade Test Grade 7 224 155.70 156.86 –1.16 –0.18 ns –7
North Carolina End of Grade Test Grade 8 146 157.70 158.18 –0.48 –0.08 ns –3
Scientific Learning Corporation, 2004b10,11
Gates–MacGinitie Reading Test Group 1 vs. control; grades 2–8 162 30.70 (13.90) 25.00 (10.60) 5.70 0.43 Statistically significant +17
Gates–MacGinitie Reading Test Group 2 vs. control; grades 2–8 168 30.10 (14.80) 25.00 (10.60) 5.10 0.36 ns +14
Gates–MacGinitie Reading Test Group 1 vs. control; grade 4 56 26.90 (12.80) 23.20 (10.20) 3.70 0.31 ns +12
Gates–MacGinitie Reading Test Group 2 vs. control; grade 4 67 27.00 (15.20) 23.20 (10.20) 3.80 0.28 ns +11
Gates–MacGinitie Reading Test Group 1 vs. control; grade 5 103 35.40 (14.10) 30.60 (9.10) 4.80 0.34 ns +14
Gates–MacGinitie Reading Test Group 2 vs. control; grade 5 83 34.90 (13.40) 30.60 (9.10) 4.30 0.33 ns +13

ns = not statistically significant

1 This appendix presents subgroup findings for measures that fall in the comprehension domain. Total group scores were used for rating purposes and are presented in Appendix A3.4.
2 The intervention group values are the comparison group means plus the difference in mean gains between the intervention and comparison groups.
3 The standard deviation across all students in each group shows how dispersed the participants’ outcomes are: a smaller standard deviation on a given measure would indicate that participants had more similar outcomes.
4 The Adolescent Literacy topic area reviews studies of interventions administered to students in grades 4–12 (or 9–18 years of age). For studies that include samples of students that span both the Adolescent Literacy (grades 4–12) and Beginning Reading (grades K–3) topic areas and cannot be disaggregated by grade level, the Adolescent Literacy topic area reviews any studies that include 5th-grade students or higher. For example, this appendix includes a combined sample of students from grades 2–8 (Scientific Learning Corporation, 2004b).
5 Positive differences and effect sizes favor the intervention group; negative differences and effect sizes favor the comparison group.
6 For an explanation of the effect size calculation, see WWC Procedures and Standards Handbook, Appendix B.
7 Statistical significance is the probability that the difference between groups is a result of chance rather than a real difference between the groups.
8 The improvement index represents the difference between the percentile rank of the average student in the intervention condition and that of the average student in the comparison condition. The improvement index can take on values between –50 and +50, with positive numbers denoting results favorable to the intervention group.
9 This study reported only the mean values for the outcome measure, not the standard deviations. The effect size was calculated from the F-statistics of the one-way ANOVA reported in the study.
10 Treatment group 1 received the intervention from September to November, and treatment group 2 received the intervention from December to February.
11 The level of statistical significance was reported by the study authors or, when necessary, calculated by the WWC to correct for clustering within classrooms or schools (corrections for multiple comparisons were not done for findings not included in the overall intervention rating). For an explanation about the clustering correction, see the WWC Tutorial on Mismatch. For the formulas the WWC used to calculate the statistical significance, see WWC Procedures and Standards Handbook, Appendix C. For the Fast ForWord® studies summarized here, no corrections for clustering were needed.

Appendix A5.1 Fast ForWord® rating for the alphabetics domain

The WWC rates an intervention’s effects for a given outcome domain as positive, potentially positive, mixed, no discernible effects, potentially negative, or negative.1

For the outcome domain of alphabetics, the WWC rated Fast ForWord® as having no discernible effects for adolescent learners. It did not meet the criteria for positive effects, potentially positive effects, mixed effects, potentially negative effects, or negative effects because no studies showed statistically significant or substantively important effects, either positive or negative.

Rating received
No discernible effects: No affirmative evidence of effects.
  • Criterion 1: No studies showing a statistically significant or substantively important effect, either positive or negative.

    Met. No studies showed a statistically significant or substantively important effect, either positive or negative. Two studies showed indeterminate effects.

Other ratings considered

Positive effects: Strong evidence of a positive effect with no overriding contrary evidence.

  • Criterion 1: Two or more studies showing statistically significant positive effects, at least one of which met WWC evidence standards for a strong design.

    Not met. No studies showed a statistically significant positive effect.

    AND

  • Criterion 2: No studies showing statistically significant or substantively important negative effects.

    Met. No studies showed a statistically significant or substantively important negative effect.

Potentially positive effects: Evidence of a positive effect with no overriding contrary evidence.

  • Criterion 1: At least one study showing a statistically significant or substantively important positive effect.

    Not met. No studies showed a statistically significant or substantively important positive effect.

    AND

  • Criterion 2: No studies showing a statistically significant or substantively important negative effect and fewer or the same number of studies showing indeterminate effects than showing statistically significant or substantively important positive effects.

    Not met. No studies showed a statistically significant or substantively important negative effect. Two studies showed indeterminate effects, and no studies showed a statistically significant or substantively important positive effect.

Mixed effects: Evidence of inconsistent effects as demonstrated through either of the following criteria.

  • Criterion 1: At least one study showing a statistically significant or substantively important positive effect, and at least one study showing a statistically significant or substantively important negative effect, but no more such studies than the number showing a statistically significant or substantively important positive effect.

    Not met. No studies showed a statistically significant or substantively important positive effect, and no studies showed a statistically significant or substantively important negative effect.

    OR

  • Criterion 2: At least one study showing a statistically significant or substantively important effect, and more studies showing an indeterminate effect than showing a statistically significant or substantively important effect.

    Not met. No studies showed a statistically significant or substantively important effect.

Potentially negative effects: Evidence of a negative effect with no overriding contrary evidence.

  • Criterion 1: One study showing a statistically significant or substantively important negative effect and no studies showing a statistically significant or substantively important positive effect.

    Not met. No studies showed a statistically significant or substantively important effect, either positive or negative.

    OR

  • Criterion 2: Two or more studies showing statistically significant or substantively important negative effects, at least one study showing a statistically significant or substantively important positive effect, and more studies showing statistically significant or substantively important negative effects than showing statistically significant or substantively important positive effects.

    Not met. No studies showed a statistically significant or substantively important effect, either positive or negative.

Negative effects: Strong evidence of a negative effect with no overriding contrary evidence.

  • Criterion 1: Two or more studies showing statistically significant negative effects, at least one of which met WWC evidence standards for a strong design.

    Not met. No studies showed a statistically significant negative effect.

    AND

  • Criterion 2: No studies showing statistically significant or substantively important positive effects.

    Met. No studies showed a statistically significant or substantively important positive effect.

1 For rating purposes, the WWC considers the statistical significance of individual outcomes and the domain-level effect. The WWC also considers the size of the domain-level effect for ratings of potentially positive or potentially negative effects. For a complete description, see the WWC Procedures and Standards Handbook, Appendix E.
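
The study-count criteria listed in this appendix can be expressed as a short decision sketch. The following is an illustrative reading of those criteria only, not a WWC tool: the names `StudyCounts` and `ratings_met` are hypothetical, the strong-design flags are assumed inputs, and the sketch ignores the domain-level effect size and significance that footnote 1 says the WWC also considers. It returns every rating whose criteria are met rather than choosing among them.

```python
from dataclasses import dataclass


@dataclass
class StudyCounts:
    """Hypothetical container for per-domain study counts."""
    sig_pos: int = 0        # statistically significant positive effects
    sub_pos: int = 0        # substantively important (not significant) positive effects
    indeterminate: int = 0  # indeterminate effects
    sub_neg: int = 0        # substantively important (not significant) negative effects
    sig_neg: int = 0        # statistically significant negative effects
    strong_sig_pos: bool = False  # a significant positive study has a strong design
    strong_sig_neg: bool = False  # a significant negative study has a strong design


def ratings_met(c: StudyCounts) -> list[str]:
    """Return the ratings whose study-count criteria are all met."""
    pos = c.sig_pos + c.sub_pos  # significant or substantively important positive
    neg = c.sig_neg + c.sub_neg  # significant or substantively important negative
    checks = {
        # Criterion 1 AND Criterion 2
        "positive effects": c.sig_pos >= 2 and c.strong_sig_pos and neg == 0,
        # Criterion 1 AND Criterion 2
        "potentially positive effects": pos >= 1 and neg == 0 and c.indeterminate <= pos,
        # Criterion 1 OR Criterion 2
        "mixed effects": (pos >= 1 and 1 <= neg <= pos)
        or (pos + neg >= 1 and c.indeterminate > pos + neg),
        # Criterion 1 only
        "no discernible effects": pos + neg == 0,
        # Criterion 1 OR Criterion 2
        "potentially negative effects": (neg == 1 and pos == 0)
        or (neg >= 2 and pos >= 1 and neg > pos),
        # Criterion 1 AND Criterion 2
        "negative effects": c.sig_neg >= 2 and c.strong_sig_neg and pos == 0,
    }
    return [name for name, met in checks.items() if met]


# Alphabetics domain: two studies, both with indeterminate effects.
print(ratings_met(StudyCounts(indeterminate=2)))  # ['no discernible effects']
```

With two indeterminate studies and no other findings, only the no discernible effects criterion is satisfied, which matches the rating reported above.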


Appendix A5.2 Fast ForWord® rating for the reading fluency domain

The WWC rates an intervention’s effects for a given outcome domain as positive, potentially positive, mixed, no discernible effects, potentially negative, or negative.¹

For the outcome domain of reading fluency, the WWC rated Fast ForWord® as having potentially positive effects for adolescent learners. It did not meet the criteria for positive effects because no studies showed statistically significant positive effects. The remaining ratings (mixed effects, no discernible effects, potentially negative effects, and negative effects) were not considered, as Fast ForWord® was assigned the highest applicable rating.

Rating received

Potentially positive effects: Evidence of a positive effect with no overriding contrary evidence.

  • Criterion 1: At least one study showing a statistically significant or substantively important positive effect.

    Met. One study showed a substantively important positive effect.

    AND

  • Criterion 2: No studies showing a statistically significant or substantively important negative effect and fewer or the same number of studies showing indeterminate effects than showing statistically significant or substantively important positive effects.

    Met. No studies showed a statistically significant or substantively important negative effect, no studies showed indeterminate effects, and one study showed substantively important positive effects.

Other ratings considered

Positive effects: Strong evidence of a positive effect with no overriding contrary evidence.

  • Criterion 1: Two or more studies showing statistically significant positive effects, at least one of which met WWC evidence standards for a strong design.

    Not met. No studies showed statistically significant positive effects.

    AND

  • Criterion 2: No studies showing statistically significant or substantively important negative effects.

    Met. No studies showed a statistically significant or substantively important negative effect.

 
1 For rating purposes, the WWC considers the statistical significance of individual outcomes and the domain-level effect. The WWC also considers the size of the domain-level effect for ratings of potentially positive or potentially negative effects. For a complete description, see the WWC Procedures and Standards Handbook, Appendix E.


Appendix A5.3 Fast ForWord® rating for the comprehension domain

The WWC rates an intervention’s effects for a given outcome domain as positive, potentially positive, mixed, no discernible effects, potentially negative, or negative.¹

For the outcome domain of comprehension, the WWC rated Fast ForWord® as having potentially positive effects for adolescent learners. It did not meet the criteria for positive effects because only one study showed statistically significant positive effects. The remaining ratings (mixed effects, no discernible effects, potentially negative effects, and negative effects) were not considered, as Fast ForWord® was assigned the highest applicable rating.

Rating received

Potentially positive effects: Evidence of a positive effect with no overriding contrary evidence.

  • Criterion 1: At least one study showing a statistically significant or substantively important positive effect.

    Met. One study showed a statistically significant positive effect, and two studies showed substantively important positive effects.

    AND

  • Criterion 2: No studies showing a statistically significant or substantively important negative effect and fewer or the same number of studies showing indeterminate effects than showing statistically significant or substantively important positive effects.

    Met. No studies showed a statistically significant or substantively important negative effect. Three studies showed indeterminate effects, one study showed a statistically significant positive effect, and two studies showed substantively important positive effects.

Other ratings considered

Positive effects: Strong evidence of a positive effect with no overriding contrary evidence.

  • Criterion 1: Two or more studies showing statistically significant positive effects, at least one of which met WWC evidence standards for a strong design.

    Not met. Only one study showed a statistically significant positive effect.

    AND

  • Criterion 2: No studies showing statistically significant or substantively important negative effects.

    Met. No studies showed a statistically significant or substantively important negative effect.

 
1 For rating purposes, the WWC considers the statistical significance of individual outcomes and the domain-level effect. The WWC also considers the size of the domain-level effect for ratings of potentially positive or potentially negative effects. For a complete description, see the WWC Procedures and Standards Handbook, Appendix E.


Appendix A5.4 Fast ForWord® rating for the general literacy achievement domain

The WWC rates an intervention’s effects for a given outcome domain as positive, potentially positive, mixed, no discernible effects, potentially negative, or negative.¹

For the outcome domain of general literacy achievement, the WWC rated Fast ForWord® as having no discernible effects for adolescent learners. It did not meet the criteria for positive effects, potentially positive effects, mixed effects, potentially negative effects, or negative effects because no studies showed statistically significant or substantively important effects, either positive or negative.

Rating received

No discernible effects: No affirmative evidence of effects.

  • Criterion 1: No studies showing a statistically significant or substantively important effect, either positive or negative.

    Met. No studies showed a statistically significant or substantively important effect, either positive or negative. Five studies showed indeterminate effects.

Other ratings considered

Positive effects: Strong evidence of a positive effect with no overriding contrary evidence.

  • Criterion 1: Two or more studies showing statistically significant positive effects, at least one of which met WWC evidence standards for a strong design.

    Not met. No studies showed a statistically significant positive effect.

    AND

  • Criterion 2: No studies showing statistically significant or substantively important negative effects.

    Met. No studies showed a statistically significant or substantively important negative effect.

Potentially positive effects: Evidence of a positive effect with no overriding contrary evidence.

  • Criterion 1: At least one study showing a statistically significant or substantively important positive effect.

    Not met. No studies showed a statistically significant or substantively important positive effect.

    AND

  • Criterion 2: No studies showing a statistically significant or substantively important negative effect and fewer or the same number of studies showing indeterminate effects than showing statistically significant or substantively important positive effects.

    Not met. No studies showed a statistically significant or substantively important negative effect. Five studies showed indeterminate effects, and no studies showed a statistically significant or substantively important positive effect.

Mixed effects: Evidence of inconsistent effects as demonstrated through either of the following criteria.

  • Criterion 1: At least one study showing a statistically significant or substantively important positive effect, and at least one study showing a statistically significant or substantively important negative effect, but no more such studies than the number showing a statistically significant or substantively important positive effect.

    Not met. No studies showed a statistically significant or substantively important positive effect, and no studies showed a statistically significant or substantively important negative effect.

    OR

  • Criterion 2: At least one study showing a statistically significant or substantively important effect, and more studies showing an indeterminate effect than showing a statistically significant or substantively important effect.

    Not met. No studies showed a statistically significant or substantively important effect.

Potentially negative effects: Evidence of a negative effect with no overriding contrary evidence.

  • Criterion 1: One study showing a statistically significant or substantively important negative effect and no studies showing a statistically significant or substantively important positive effect.

    Not met. No studies showed a statistically significant or substantively important effect, either positive or negative.

    OR

  • Criterion 2: Two or more studies showing statistically significant or substantively important negative effects, at least one study showing a statistically significant or substantively important positive effect, and more studies showing statistically significant or substantively important negative effects than showing statistically significant or substantively important positive effects.

    Not met. No studies showed a statistically significant or substantively important effect, either positive or negative.

Negative effects: Strong evidence of a negative effect with no overriding contrary evidence.

  • Criterion 1: Two or more studies showing statistically significant negative effects, at least one of which met WWC evidence standards for a strong design.

    Not met. No studies showed a statistically significant negative effect.

    AND

  • Criterion 2: No studies showing statistically significant or substantively important positive effects.

    Met. No studies showed a statistically significant or substantively important positive effect.

1 For rating purposes, the WWC considers the statistical significance of individual outcomes and the domain-level effect. The WWC also considers the size of the domain-level effect for ratings of potentially positive or potentially negative effects. For a complete description, see the WWC Procedures and Standards Handbook, Appendix E.


Appendix A6 Extent of evidence by domain

Outcome domain | Number of studies | Sample size: schools | Sample size: students | Extent of evidence¹
Alphabetics | 2 | 7 | 161 | Small
Reading fluency | 1 | 3 | 24 | Small
Comprehension | 6 | >42² | 1,376 | Medium to large
General literacy achievement | 5 | 20 | 873³ | Medium to large
1 A rating of “medium to large” requires at least two studies and two schools across studies in one domain and a total sample size across studies of at least 350 students or 14 classrooms. Otherwise, the rating is “small.” For more details on the extent of evidence categorization, see the WWC Procedures and Standards Handbook, Appendix G.
2 The number of control schools in Overbay and Baenen (2002) is unknown.
3 For Rouse and Krueger (2004), we counted the number of students as 454, which is from the state assessment. The actual number of students might be higher, as we do not know to what extent the number of students from the three outcome measures overlapped.
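
The threshold rule in footnote 1 can be checked against the table with a few lines of code. This is a minimal sketch of that footnote only, not the full categorization in the WWC handbook. The function name `extent_of_evidence` and the optional `classrooms` argument are hypothetical, and the comprehension row uses 42 schools as a lower bound because the exact count is unknown.

```python
from typing import Optional


def extent_of_evidence(studies: int, schools: int, students: int,
                       classrooms: Optional[int] = None) -> str:
    """Classify extent of evidence per the footnote 1 thresholds."""
    # Sample-size condition: at least 350 students OR at least 14 classrooms.
    # Classroom counts are not reported in the table, so they are optional here.
    enough_sample = students >= 350 or (classrooms is not None and classrooms >= 14)
    if studies >= 2 and schools >= 2 and enough_sample:
        return "medium to large"
    return "small"


# Rows from the table above: (studies, schools, students).
domains = {
    "Alphabetics": (2, 7, 161),
    "Reading fluency": (1, 3, 24),
    "Comprehension": (6, 42, 1376),   # schools is a lower bound (">42")
    "General literacy achievement": (5, 20, 873),
}
for name, (studies, schools, students) in domains.items():
    print(name, "->", extent_of_evidence(studies, schools, students))
```

Alphabetics has enough studies and schools but only 161 students, and reading fluency has a single study, so both fall to "small" under the footnote's rule. The other two domains clear all three thresholds.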
