What Works Clearinghouse


Appendices


Appendix A1.1 Study Characteristics: Macaruso, Hook, & McCabe, 2006

Characteristic Description
Study citation Macaruso, P., Hook, P. E., & McCabe, R. (2006). The efficacy of computer-based supplementary phonics programs for advancing reading skills in at-risk elementary students. Journal of Research in Reading, 29(2), 162–172.
Participants Study participants were first-graders in 10 classrooms spread across five schools, with two classrooms in each school (one treatment classroom and one comparison classroom) participating in the study. The study initially included 92 intervention and 87 comparison students. Twelve students (9 intervention, 3 comparison) left the study when it was determined that they were eligible for special education services. The analysis sample contained 15 Title I students in each of the intervention and comparison groups (Title I students received an additional 30 minutes of academic instruction per day from a Title I staff member).
Setting First-grade classrooms in a Massachusetts public school district.
Intervention Lexia Reading is a computerized, supplementary reading software program designed for regular use, consisting of two to four weekly sessions of 20 to 30 minutes each, in a lab or classroom setting. In the study, intervention students were exposed to two Lexia Reading components: Phonics Based Reading (PBR) and Strategies for Older Students (SOS). The PBR component has 3 levels, 17 skill activities, and 174 units covering basic phonics skills usually taught in grades 1 through 3. After finishing PBR activities, children were introduced to SOS activities, which consist of 5 levels, 24 skill activities, and 369 discrete units. Intervention classes used Lexia Reading software for approximately six months, with children completing an average of 64 sessions and 140 skill units. Most students worked on PBR activities only; 14 students (17%) in the intervention programs moved on to SOS activities, working mainly on early levels.
Comparison Students in the comparison group classrooms received regular classroom instruction while intervention group classrooms were participating in the Lexia Reading program.
Primary outcomes and measurement For both the pretest and the posttest, the authors used the Gates-MacGinitie Reading Test, Level BR to assess reading performance. For a more detailed description of this outcome measure and its subtests, see Appendices A2.1, A2.3, and A2.4.
Staff/teacher training Teachers in intervention classrooms had an average of 19 years of teaching experience, and teachers in comparison classrooms had an average of 18 years of teaching experience. Teachers in the intervention classrooms and computer lab staff received orientation and training sessions for implementing Lexia Reading software use.

Appendix A1.2 Study Characteristics: Gale, 2006

Characteristic Description
Study citation Gale, D. (2006). The effect of computer-delivered phonological awareness training on the early literacy skills of students identified as at-risk for reading failure. Retrieved from the University of South Florida website: http://purl.fcla.edu/usf/dc/et/SFE0001531.
Participants Kindergarten and first-grade students who were identified in the fall assessment period as needing intensive substantial intervention based on their performance on the Dynamic Indicators of Basic Early Literacy Skills (DIBELS) were recruited for this study. Forty-one kindergarten students and 38 first-grade students were randomly assigned to one of three groups: (1) Lexia Early Reading, (2) Earobics® Step 1, or (3) control. After attrition, the analysis sample contained 39 kindergarten and 37 first-grade students.
Setting The elementary school in which this study occurred is located in a large school district in the southwest region of Florida serving approximately 114,500 pre-K to twelfth-grade students. The elementary school had a total kindergarten through fifth-grade student enrollment of 722. Students in the school represented the following ethnic groups: 60% Caucasian, 19% Hispanic, 8% Asian/Pacific Islander, 7% African-American, 5% multiracial, and <1% American Indian/Alaskan Native. Approximately three-quarters of the students in this school were eligible for free or reduced-price lunch.
Intervention A rotation schedule was developed by the researcher based on teacher input. The two phonological awareness software programs were loaded on 14 numbered computers with headphones in the computer lab at the elementary school. Each student was assigned to a computer to use throughout the intervention period. Before the intervention period began, the researcher trained the participants in small groups of five on the relevant intervention software (Lexia Early Reading or Earobics® Step 1) with regard to initiating and proceeding through the program and navigating the mouse. Students were required to pass at least five out of six areas on the training checklist as well as the task "use mouse to navigate activity" before beginning the intervention. The students were divided into four groups that alternated in the computer lab according to the rotation schedule. The researcher and a teacher assistant monitored the students each day during their training in the computer lab. Students used their respective computer programs in the school computer lab 20 minutes daily for 25 days, resulting in a total of 8 hours 20 minutes of exposure.
Comparison The control group received no reading instruction beyond the regular language arts time. Typical reading instruction in the school was a 90-minute reading block.
Primary outcomes and measurement Students were tested before and after the intervention using the DIBELS subtests for Initial Sounds Fluency (kindergarten only), Letter Naming Fluency, Phoneme Segmentation Fluency, Nonsense Word Fluency (first grade only) and Oral Reading Fluency (first grade only). For a more detailed description of these outcome measures, see Appendices A2.1 and A2.2.
Staff/teacher training No information on teacher training was provided. The Lexia Early Reading group worked in a computer lab, with minimal teacher instruction.

Appendix A1.3 Study Characteristics: Macaruso & Walker, 2008

Characteristic Description
Study citation Macaruso, P., & Walker, A. (2008). The efficacy of computer-assisted instruction for advancing literacy skills in kindergarten children. Reading Psychology, 29(3), 266–287.  
Participants Six kindergarten classes from two elementary schools participated in the study. The six classes included morning and afternoon classes for each of three teachers. The authors randomly assigned the six classes to treatment (Lexia Early Reading) or comparison (extra time spent in language-related classroom activities), blocked by teacher. These six classes included a total of 94 students. After randomly assigning classrooms, the authors dropped from the analysis 11 students (9 intervention, 2 comparison) who were designated as English Language Learners or special education. At the end of the study, the authors excluded another 12 students from the treatment group who had not completed their minimum criterion of more than 45 sessions with Lexia Early Reading. The final analysis sample consisted of 26 students in the Lexia Early Reading group and 45 students in the comparison group. The authors demonstrated that there were no statistically significant pre-intervention differences between the two analysis groups on the baseline measures (DIBELS: Initial Sounds Fluency and DIBELS: Letter Naming Fluency).
Setting The participating schools were two urban elementary schools near Boston, Massachusetts. Twenty-nine percent of families in the school system spoke a language other than English at home, and the median household income in the school district was $37,000 (compared to a state median of $50,000). More than half of the students in the district qualified for free or reduced-price lunch.
Intervention Classes in the intervention condition began using Lexia Early Reading in November and continued for approximately six months. Students used the software in two to three weekly sessions of 15 to 20 minutes each. On average, students in the analysis sample completed 52 sessions with the software. Lexia Early Reading contains nine activities involving sound identification, rhyming, segmenting and blending of sounds, and application of letter-sound correspondences for subsets of consonants and vowels. Each activity consists of several units; students progress to the next activity only after mastering skills in the prior activity.
Comparison Students in the comparison condition spent extra time engaged in language-related classroom activities.
Primary outcomes and measurement At the end of the study period, the students were tested using the DIBELS subtests for Letter Naming Fluency and Phoneme Segmentation Fluency, as well as both subtest and overall composite scores for the Gates-MacGinitie Reading Test, Level PR. Because the composite score for the Gates-MacGinitie measure spans the alphabetics and comprehension domains, the subtest results for alphabetics and comprehension are presented as the main findings in Appendices A3.1 and A3.3, and the composite score results are presented as supplemental findings in Appendix A4.3. For a more detailed description of this outcome measure and its subtests, see Appendices A2.1 and A2.3.
Staff/teacher training Kindergarten teachers and computer lab staff participated in an orientation and training session for Lexia Early Reading software implementation.
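
The participant flow described under Participants can be checked with simple arithmetic. The sketch below works backward from the reported analysis sample to the group sizes implied at assignment. The variable names are illustrative, and the implied starting sizes are derived here rather than reported in the study.

```python
# Participant flow for Macaruso & Walker (2008), using counts from the table above.
# Variable names are illustrative; only the counts come from the report.
total_assigned = 94

# Students dropped after random assignment (English Language Learners or special education).
dropped_ell_sped = {"intervention": 9, "comparison": 2}
# Intervention students excluded for not completing more than 45 sessions.
dropped_low_usage = {"intervention": 12, "comparison": 0}

# Final analysis sample as reported.
analysis_sample = {"intervention": 26, "comparison": 45}

# Work backward to the group sizes implied at assignment (derived, not reported).
implied_assigned = {
    group: analysis_sample[group] + dropped_ell_sped[group] + dropped_low_usage[group]
    for group in analysis_sample
}

total_analyzed = sum(analysis_sample.values())
total_dropped = sum(dropped_ell_sped.values()) + sum(dropped_low_usage.values())

print(implied_assigned)               # implied group sizes at assignment
print(total_analyzed, total_dropped)  # analyzed and dropped totals
```

Under these counts the two groups start at the same implied size, and the analyzed and dropped totals add back up to the 94 students in the six classes.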

Appendix A2.1 Outcome measures for the alphabetics domain

Outcome measure Description
Phonological Awareness
Gates-MacGinitie Reading Test, Level BR: Letter-Sound Correspondences subtest Students are required to match letters with their appropriate sounds. It is one of four subtests on the Gates-MacGinitie Reading Test, Level BR (as cited in Macaruso, Hook, & McCabe, 2006).
Dynamic Indicators of Basic Early Literacy Skills (DIBELS): Initial Sounds Fluency subtest Students are presented with four pictures that are named by the examiner. The examiner then asks the student to identify the picture that begins with a sound presented orally by the examiner. The student is also asked to orally provide the initial sound in a word presented orally by the examiner. The score is calculated by totaling the amount of time that it takes the student to identify or produce the correct sounds and converting that time into the number of correct onsets in a minute (as cited in Gale, 2006).
Dynamic Indicators of Basic Early Literacy Skills (DIBELS): Phoneme Segmentation Fluency subtest In this task, the student is given a word and asked to provide the individual phonemes that make up the word. Words are continuously presented for one minute. The score is calculated by how many phonemes the student correctly segments in one minute (as cited in Gale, 2006).

Gates-MacGinitie Reading Test, Level PR: Oral Language Concepts subtest This subtest requires children to identify pictures with names that begin or end with the same sound or identify pictures that have rhyming names (as cited in Macaruso & Walker, 2008).

Letter Knowledge
Dynamic Indicators of Basic Early Literacy Skills (DIBELS): Letter Naming Fluency subtest This task requires the student to orally identify upper- and lowercase letters presented in random order on a piece of paper. The student names as many letters (out of 120) as he or she can in one minute with the examiner providing the name if the student hesitates for three seconds. The score is calculated by the number of correctly named letters in one minute (as cited in Gale, 2006).
Gates-MacGinitie Reading Test, Level PR: Letters and Letter-Sound Correspondences subtest Children identify when two letters match and match letters with pictures that begin with sounds corresponding to the letters (as cited in Macaruso & Walker, 2008).
Phonics
Dynamic Indicators of Basic Early Literacy Skills (DIBELS): Nonsense Word Fluency subtest In this measure, the student is presented with randomly ordered vowel-consonant and consonant-vowel-consonant nonsense words on a sheet of paper and asked to produce either the individual sounds or the whole nonsense word. The child has one minute to produce as many letter sounds or words as he or she can (as cited in Gale, 2006).
Print Awareness
Gates-MacGinitie Reading Test, Level PR: Literacy Concepts subtest This subtest assesses students’ basic knowledge of printed text, such as finding the first letter in a word (as cited in Macaruso & Walker, 2008).

Appendix A2.2 Outcome measures for the fluency domain

Outcome measure Description
Dynamic Indicators of Basic Early Literacy Skills (DIBELS): Oral Reading Fluency Oral Reading Fluency is a measure of accuracy and fluency with connected text. Students are presented with a passage calibrated at their grade level and asked to read aloud for one minute. Scoring is based on mispronunciations, omissions, substitutions, and hesitations (as cited in Gale, 2006).

Appendix A2.3 Outcome measures for the comprehension domain

Outcome measure Description
Vocabulary Development
Gates-MacGinitie Reading Test, Level BR: Basic Story Words subtest Children are tested on their ability to recognize words that appear most commonly in written text and do not require decoding. It is one of four subtests on the Gates-MacGinitie Reading Test, Level BR (as cited in Macaruso, Hook, & McCabe, 2006).
Reading Comprehension
Gates-MacGinitie Reading Test, Level PR: Listening Comprehension subtest This subtest asks children to listen to a passage and select a picture that most closely reflects the meaning of the passage (as cited in Macaruso & Walker, 2008).

Appendix A2.4 Outcome measures for the general reading achievement domain

Outcome measure Description
Gates-MacGinitie Reading Test, Level BR: Form S This test contains four subtests: (1) letter-sound correspondences for initial consonants and consonant clusters, (2) letter-sound correspondences for final consonants and consonant clusters, (3) letter-sound correspondences for vowels, and (4) recognizing basic story words (as cited in Macaruso, Hook, & McCabe, 2006).

Appendix A3.1 Summary of study findings included in the rating for the alphabetics domain¹

The group means and standard deviations are the authors' findings from the study; the mean difference, effect size, statistical significance, and improvement index are WWC calculations.

| Outcome measure | Study sample | Sample size (clusters/students) | Lexia Reading group: mean (standard deviation)² | Comparison group: mean (standard deviation)² | Mean difference³ (Lexia Reading – comparison) | Effect size⁴ | Statistical significance⁵ (at α = 0.05) | Improvement index⁶ |
|---|---|---|---|---|---|---|---|---|
| Gale, 2006⁷,⁸ | | | | | | | | |
| Comparison #1: Lexia Early Reading vs. Control | | | | | | | | |
| Construct: Phonological Awareness | | | | | | | | |
| DIBELS: Initial Sounds Fluency | Kindergarten | 26 | 10.07 (5.01) | 5.21 (3.00) | 4.86 | 1.14 | Statistically significant | +37 |
| DIBELS: Phoneme Segmentation Fluency | Kindergarten | 26 | 1.31⁹ (0.63) | 0.00¹⁰ (0.00) | 1.31 | 2.85 | Statistically significant | +50 |
| DIBELS: Phoneme Segmentation Fluency | Grade 1 | 24 | 37.66 (13.71) | 31.02 (10.57) | 6.64 | 0.52 | ns | +20 |
| Construct: Letter Knowledge | | | | | | | | |
| DIBELS: Letter Naming Fluency | Kindergarten | 26 | 16.92⁹ (12.91) | 13.08¹⁰ (10.00) | 3.84 | 0.32 | ns | +13 |
| DIBELS: Letter Naming Fluency | Grade 1 | 24 | 48.11 (14.33) | 38.02 (8.97) | 10.09 | 0.81 | ns | +29 |
| Construct: Phonics | | | | | | | | |
| DIBELS: Nonsense Word Fluency | Grade 1 | 24 | 40.87 (15.12) | 26.11 (11.44) | 14.76 | 1.06 | ns | +36 |
| Average for alphabetics, Comparison #1 (Gale, 2006)¹¹ | | | | | | 1.12 | Statistically significant | +37 |
| Comparison #2: Lexia Early Reading vs. Earobics® | | | | | | | | |
| Construct: Phonological Awareness | | | | | | | | |
| DIBELS: Initial Sounds Fluency | Kindergarten | 26 | 10.07 (5.01) | 13.72 (4.61) | –3.65 | –0.73 | ns | –27 |
| DIBELS: Phoneme Segmentation Fluency | Kindergarten | 26 | 1.31⁹ (0.63) | 1.31¹⁰ (0.75) | 0.00 | 0.00 | ns | 0 |
| DIBELS: Phoneme Segmentation Fluency | Grade 1 | 25 | 37.66 (13.71) | 47.75 (8.08) | –10.09 | –0.88 | ns | –31 |
| Construct: Letter Knowledge | | | | | | | | |
| DIBELS: Letter Naming Fluency | Kindergarten | 26 | 18.31⁹ (12.91) | 21.08¹⁰ (11.74) | –2.77 | –0.22 | ns | –9 |
| DIBELS: Letter Naming Fluency | Grade 1 | 25 | 48.11 (14.33) | 50.26 (13.83) | –2.15 | –0.15 | ns | –6 |
| Construct: Phonics | | | | | | | | |
| DIBELS: Nonsense Word Fluency | Grade 1 | 25 | 40.87 (15.12) | 47.72 (19.65) | –6.85 | –0.38 | ns | –15 |
| Average for alphabetics, Comparison #2 (Gale, 2006)¹¹ | | | | | | –0.39 | ns | –15 |
| Average for alphabetics, Entire study (Gale, 2006)¹¹ | | | | | | 0.36 | ns | +14 |
| Macaruso & Walker, 2008⁷,¹² | | | | | | | | |
| Construct: Phonological Awareness | | | | | | | | |
| DIBELS: Phoneme Segmentation Fluency | Kindergarten | 6/71 | 28.00 (13.30) | 30.90 (19.10) | –2.90 | –0.17 | ns | –7 |
| Gates-MacGinitie: Oral Language Concepts | Kindergarten | 6/71 | 14.80 (4.00) | 12.80 (3.50) | 2.00 | 0.54 | ns | +20 |
| Construct: Letter Knowledge | | | | | | | | |
| DIBELS: Letter Naming Fluency | Kindergarten | 6/71 | 38.30 (16.90) | 38.50 (17.00) | –0.20 | –0.01 | ns | 0 |
| Gates-MacGinitie: Letters and Letter-Sound Correspondences | Kindergarten | 6/71 | 24.70 (4.50) | 23.70 (5.40) | 1.00 | 0.19 | ns | +8 |
| Construct: Print Awareness | | | | | | | | |
| Gates-MacGinitie: Literacy Concepts | Kindergarten | 6/71 | 16.80 (2.80) | 15.70 (3.00) | 1.10 | 0.37 | ns | +14 |
| Average for alphabetics (Macaruso & Walker, 2008)¹¹ | | | | | | 0.18⁹ | ns | +7⁹ |
| Domain average for alphabetics across all studies¹¹ | | | | | | 0.27 | na | +11 |

ns = not statistically significant
na = not applicable

1This appendix reports findings considered for the effectiveness rating and the average improvement indices for the alphabetics domain. Subtest and subgroup findings from Macaruso and Walker (2008) are not included in these ratings but are reported in Appendix A4.1.
2The standard deviation across all students in each group shows how dispersed the participants’ outcomes are: a smaller standard deviation on a given measure would indicate that participants had more similar outcomes.
3Positive differences and effect sizes favor the intervention group; negative differences and effect sizes favor the comparison group.
4For an explanation of the effect size calculation, see Technical Details of WWC-Conducted Computations.
5Statistical significance is the probability that the difference between groups is a result of chance rather than a real difference between the groups.
6The improvement index represents the difference between the percentile rank of the average student in the intervention condition and that of the average student in the comparison condition. The improvement index can take on values between –50 and +50, with positive numbers denoting results favorable to the intervention group.
7The level of statistical significance was reported by the study authors or, when necessary, calculated by the WWC to correct for clustering within classrooms or schools and for multiple comparisons. For an explanation about the clustering correction, see the WWC Tutorial on Mismatch. For the formulas the WWC used to calculate statistical significance, see Technical Details of WWC-Conducted Computations. In the case of Gale (2006), corrections for multiple comparisons were needed, and in the case of Macaruso & Walker (2008), corrections for clustering and multiple comparisons were needed, so the significance levels may differ from those reported in the original studies.
8Unless otherwise noted, means from this study are posttest means, ANCOVA-adjusted for pretest differences, as reported in Gale (2006).
9The Lexia Reading group mean equals the comparison group mean plus the mean difference. The study author did not provide adjusted means for this outcome, so the WWC calculated the mean difference in outcomes, taking into account the pretest difference between the study groups. For further details, please see Technical Details of WWC-Conducted Computations.
10Unadjusted posttest mean as reported in Gale (2006).
11The WWC-computed average effect sizes for each study and for the domain across studies are simple averages rounded to two decimal places. The average improvement indices are calculated from the average effect sizes.
12In this study, the authors did an ANCOVA-adjustment for pretest scores when calculating statistical significance but presented raw means and standard deviations.
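
Footnote 11 describes the study and domain averages as simple averages of effect sizes. A minimal sketch of that arithmetic follows, using the two-decimal effect sizes printed in the table above. It assumes that the study average for Gale (2006) is the mean of its two comparison averages and that the domain average is the mean of the two study averages; the variable names are illustrative, and the WWC may have worked from unrounded values.

```python
from statistics import fmean

# Effect sizes as printed in the table above (two decimals).
gale_vs_control = [1.14, 2.85, 0.52, 0.32, 0.81, 1.06]
gale_vs_earobics = [-0.73, 0.00, -0.88, -0.22, -0.15, -0.38]
macaruso_walker = [-0.17, 0.54, -0.01, 0.19, 0.37]

avg_control = fmean(gale_vs_control)
avg_earobics = fmean(gale_vs_earobics)

# Assumption: study average = mean of the two comparison averages,
# domain average = mean of the two study averages.
avg_gale = fmean([avg_control, avg_earobics])
avg_mw = fmean(macaruso_walker)
domain_avg = fmean([avg_gale, avg_mw])

for label, value in [
    ("Gale, Comparison #1", avg_control),
    ("Gale, Comparison #2", avg_earobics),
    ("Gale, entire study", avg_gale),
    ("Macaruso & Walker", avg_mw),
    ("Domain average", domain_avg),
]:
    print(f"{label}: {value:+.2f}")
```

Because every effect size in Comparison #2 is zero or negative, its average comes out negative (about –0.39), which in turn pulls the Gale (2006) study average down to about 0.36.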

Appendix A3.2 Summary of study findings included in the rating for the fluency domain¹

The group means and standard deviations are the authors' findings from the study; the mean difference, effect size, statistical significance, and improvement index are WWC calculations.

| Outcome measure | Study sample | Sample size (students) | Lexia Reading group: mean² (standard deviation)³ | Comparison group: mean² (standard deviation)³ | Mean difference⁴ (Lexia Reading – comparison) | Effect size⁵ | Statistical significance⁶ (at α = 0.05) | Improvement index⁷ |
|---|---|---|---|---|---|---|---|---|
| Gale, 2006⁸ | | | | | | | | |
| Comparison #1: Lexia Early Reading vs. Control | | | | | | | | |
| DIBELS: Oral Reading Fluency | Grade 1 | 24 | 21.31 (9.65) | 13.81 (7.83) | 7.50 | 0.82 | ns | +30 |
| Comparison #2: Lexia Early Reading vs. Earobics® | | | | | | | | |
| DIBELS: Oral Reading Fluency | Grade 1 | 25 | 21.31 (9.65) | 27.35 (18.53) | –6.04 | –0.39 | ns | –15 |
| Domain average for fluency⁹ | | | | | | 0.22 | ns | +9 |

ns = not statistically significant

1This appendix reports findings considered for the effectiveness rating and the average improvement indices for the fluency domain.
2Means are posttest means, ANCOVA-adjusted for pretest differences, as reported in Gale (2006).
3The standard deviation across all students in each group shows how dispersed the participants’ outcomes are: a smaller standard deviation on a given measure would indicate that participants had more similar outcomes.
4Positive differences and effect sizes favor the intervention group; negative differences and effect sizes favor the comparison group.
5For an explanation of the effect size calculation, see Technical Details of WWC-Conducted Computations.
6Statistical significance is the probability that the difference between groups is a result of chance rather than a real difference between the groups.
7The improvement index represents the difference between the percentile rank of the average student in the intervention condition and that of the average student in the comparison condition. The improvement index can take on values between –50 and +50, with positive numbers denoting results favorable to the intervention group.
8The level of statistical significance was reported by the study authors or, when necessary, calculated by the WWC to correct for clustering within classrooms or schools and for multiple comparisons. For an explanation about the clustering correction, see the WWC Tutorial on Mismatch. For the formulas the WWC used to calculate statistical significance, see Technical Details of WWC-Conducted Computations. In the case of Gale (2006), a correction for multiple comparisons was needed, so the significance levels may differ from those reported in the original study.
9This row provides the study average, which in this instance, is also the domain average. The WWC-computed domain average effect size is a simple average rounded to two decimal places. The domain improvement index is calculated from the average effect size.
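
Footnote 7 defines the improvement index as a difference in percentile ranks. One common way to compute it, assumed here rather than taken from this appendix, is to convert the effect size to a percentile with the standard normal distribution and subtract 50. The function name below is illustrative.

```python
from statistics import NormalDist


def improvement_index(effect_size: float) -> int:
    """Percentile-point gain of the average intervention student (assumed formula)."""
    # Percentile rank of the average intervention student within the
    # comparison group's distribution, assuming normally distributed outcomes.
    percentile = NormalDist().cdf(effect_size) * 100
    # The average comparison student sits at the 50th percentile.
    return round(percentile - 50)


# Effect sizes from the fluency table above.
for es in (-0.39, 0.22):
    print(es, improvement_index(es))
```

With the printed effect sizes of –0.39 and 0.22 this gives –15 and +9, in line with the table. Recomputing from a two-decimal effect size can differ from a published index by a point (0.82 gives +29 here against a published +30), presumably because the published index was calculated from an unrounded effect size.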

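Appendix A3.3 Summary of study findings included in the rating for the comprehension domain¹

The group means and standard deviations are the authors' findings from the study; the mean difference, effect size, statistical significance, and improvement index are WWC calculations.

| Outcome measure | Study sample | Sample size (clusters/students) | Lexia Reading group: mean (standard deviation)² | Comparison group: mean (standard deviation)² | Mean difference³ (Lexia Reading – comparison) | Effect size⁴ | Statistical significance⁵ (at α = 0.05) | Improvement index⁶ |
|---|---|---|---|---|---|---|---|---|
| Macaruso & Walker, 2008⁷,⁸ | | | | | | | | |
| Construct: Reading Comprehension | | | | | | | | |
| Gates-MacGinitie: Listening Comprehension | Kindergarten | 6/71 | 13.60 (3.80) | 12.60 (3.50) | 1.00 | 0.27 | ns | +11 |
| Domain average for comprehension⁹ | | | | | | 0.27 | ns | +11 |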

ns = not statistically significant

1This appendix reports findings considered for the effectiveness rating and the average improvement indices for the comprehension domain. Subtest and subgroup findings from Macaruso and Walker (2008) are not included in these ratings, but are reported in Appendix A4.2.
2The standard deviation across all students in each group shows how dispersed the participants’ outcomes are: a smaller standard deviation on a given measure would indicate that participants had more similar outcomes.
3Positive differences and effect sizes favor the intervention group; negative differences and effect sizes favor the comparison group.
4For an explanation of the effect size calculation, see Technical Details of WWC-Conducted Computations.
5Statistical significance is the probability that the difference between groups is a result of chance rather than a real difference between the groups.
6The improvement index represents the difference between the percentile rank of the average student in the intervention condition and that of the average student in the comparison condition. The improvement index can take on values between –50 and +50, with positive numbers denoting results favorable to the intervention group.
7The level of statistical significance was reported by the study authors or, when necessary, calculated by the WWC to correct for clustering within classrooms or schools and for multiple comparisons. For an explanation about the clustering correction, see the WWC Tutorial on Mismatch. For the formulas the WWC used to calculate statistical significance, see Technical Details of WWC-Conducted Computations. In the case of Macaruso and Walker (2008), corrections for clustering and multiple comparisons were needed, so the significance level may differ from that reported in the original study.
8In this study, the authors did an ANCOVA-adjustment for pretest scores when calculating statistical significance but presented raw means and standard deviations.
9This row provides the study average, which in this instance, is also the domain average. The WWC-computed domain average effect size is a simple average rounded to two decimal places. The domain improvement index is calculated from the average effect size.
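
Footnote 4 points to a separate document for the effect size calculation. As a hedged illustration only, the sketch below applies a standardized mean difference with a small-sample correction (often called Hedges' g) to the Listening Comprehension row above. The formula is an assumption about the general approach, not a quotation from the WWC technical documentation, and the group sizes of 26 and 45 come from the participant description in Appendix A1.3.

```python
from math import sqrt


def hedges_g(mean_t, sd_t, n_t, mean_c, sd_c, n_c):
    """Standardized mean difference with a small-sample correction (assumed form)."""
    # Pool the two group standard deviations, weighting by degrees of freedom.
    pooled_sd = sqrt(
        ((n_t - 1) * sd_t**2 + (n_c - 1) * sd_c**2) / (n_t + n_c - 2)
    )
    # Small-sample correction factor, which shrinks the estimate slightly.
    correction = 1 - 3 / (4 * (n_t + n_c) - 9)
    return correction * (mean_t - mean_c) / pooled_sd


# Listening Comprehension: Lexia group 13.60 (3.80), comparison group 12.60 (3.50).
g = hedges_g(13.60, 3.80, 26, 12.60, 3.50, 45)
print(round(g, 2))
```

Under these assumptions the result rounds to 0.27, matching the effect size in the table. Agreement on one row does not confirm that this is the exact procedure used for every finding, since some means elsewhere in these appendices are ANCOVA-adjusted.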

Appendix A3.4 Summary of study findings included in the rating for the general reading achievement domain¹

The group means and standard deviations are the authors' findings from the study; the mean difference, effect size, statistical significance, and improvement index are WWC calculations.

| Outcome measure | Study sample | Sample size (clusters/students) | Lexia Reading group: mean² (standard deviation)³ | Comparison group: mean² (standard deviation)³ | Mean difference⁴ (Lexia Reading – comparison) | Effect size⁵ | Statistical significance⁶ (at α = 0.05) | Improvement index⁷ |
|---|---|---|---|---|---|---|---|---|
| Macaruso, Hook, & McCabe, 2006⁸ | | | | | | | | |
| Gates-MacGinitie Reading Test, Level BR: Form S | Grade 1 | 10/167 | 63.70 (14.10) | 60.40 (14.10) | 3.30 | 0.23 | ns | +9 |
| Domain average for general reading achievement⁹ | | | | | | 0.23 | ns | +9 |

ns = not statistically significant

1This appendix reports findings considered for the effectiveness rating and the average improvement indices for the general reading achievement domain. Subtest and subgroup findings from Macaruso, Hook, and McCabe (2006) are not included in these ratings, but are reported in Appendix A4.3.
2Means and standard deviations are ANCOVA-adjusted for pretest differences, as reported in communication with the author.
3The standard deviation across all students in each group shows how dispersed the participants’ outcomes are: a smaller standard deviation on a given measure would indicate that participants had more similar outcomes.
4Positive differences and effect sizes favor the intervention group; negative differences and effect sizes favor the comparison group.
5For an explanation of the effect size calculation, see Technical Details of WWC-Conducted Computations.
6Statistical significance is the probability that the difference between groups is a result of chance rather than a real difference between the groups.
7The improvement index represents the difference between the percentile rank of the average student in the intervention condition and that of the average student in the comparison condition. The improvement index can take on values between –50 and +50, with positive numbers denoting results favorable to the intervention group.
8The level of statistical significance was reported by the study authors or, when necessary, calculated by the WWC to correct for clustering within classrooms or schools and for multiple comparisons. For an explanation about the clustering correction, see the WWC Tutorial on Mismatch. For the formulas the WWC used to calculate statistical significance, see Technical Details of WWC-Conducted Computations. In the case of Macaruso, Hook, and McCabe (2006), a correction for clustering was needed, so the significance levels may differ from those reported in the original study.
9This row provides the study average, which in this instance, is also the domain average. The WWC-computed domain average effect size is a simple average rounded to two decimal places. The domain improvement index is calculated from the average effect size.

Appendix A4.1 Summary of subscale and subgroup findings for the alphabetics domain¹

The group means and standard deviations are the authors' findings from the study; the mean difference, effect size, statistical significance, and improvement index are WWC calculations.

| Outcome measure | Study sample | Sample size (clusters/students) | Lexia Reading group: mean (standard deviation)² | Comparison group: mean (standard deviation)² | Mean difference³ (Lexia Reading – comparison) | Effect size⁴ | Statistical significance⁵ (at α = 0.05) | Improvement index⁶ |
|---|---|---|---|---|---|---|---|---|
| Macaruso, Hook, & McCabe, 2006⁷,⁸,⁹ | | | | | | | | |
| Gates-MacGinitie Reading Test: Letter-Sound Correspondences | Grade 1: Title I Students | 10/30 | 39.80 (5.50) | 34.80 (5.50) | 5.00 | 0.88 | Statistically significant | +31 |
| Macaruso & Walker, 2008⁸,¹⁰ | | | | | | | | |
| Construct: Phonological Awareness | | | | | | | | |
| DIBELS: Phoneme Segmentation Fluency | Kindergarten: Low Performers | 6/24 | 29.00 (11.00) | 28.00 (21.20) | 1.00 | 0.06 | ns | +2 |
| Gates-MacGinitie: Oral Language Concepts | Kindergarten: Low Performers | 6/24 | 16.00 (2.20) | 12.40 (3.60) | 3.60 | 1.17 | Statistically significant | +38 |
| Construct: Letter Knowledge | | | | | | | | |
| DIBELS: Letter Naming Fluency | Kindergarten: Low Performers | 6/24 | 39.20 (12.40) | 38.40 (12.70) | 0.80 | 0.06 | ns | +2 |
| Gates-MacGinitie: Letters and Letter-Sound Correspondences | Kindergarten: Low Performers | 6/24 | 25.60 (2.60) | 22.30 (5.40) | 3.30 | 0.75 | ns | +27 |
| Construct: Print Awareness | | | | | | | | |
| Gates-MacGinitie: Literacy Concepts | Kindergarten: Low Performers | 6/24 | 17.10 (2.50) | 15.30 (2.90) | 1.80 | 0.64 | ns | +24 |

ns = not statistically significant

1This appendix presents subscale and subgroup findings for measures that fall in alphabetics. Total group (for Macaruso & Walker, 2008) and total scale (for Macaruso, Hook, & McCabe, 2006) scores were used for rating purposes and are presented in Appendices A3.1 and A3.4, respectively.
[2] The standard deviation across all students in each group shows how dispersed the participants’ outcomes are: a smaller standard deviation on a given measure would indicate that participants had more similar outcomes.
[3] Positive differences and effect sizes favor the intervention group; negative differences and effect sizes favor the comparison group.
[4] For an explanation of the effect size calculation, see Technical Details of WWC-Conducted Computations.
[5] Statistical significance is the probability that the difference between groups is a result of chance rather than a real difference between the groups.
[6] The improvement index represents the difference between the percentile rank of the average student in the intervention condition and that of the average student in the comparison condition. The improvement index can take on values between –50 and +50, with positive numbers denoting results favorable to the intervention group.
[7] Means and standard deviations are ANCOVA-adjusted for pretest differences, as reported in communication with the author.
[8] The level of statistical significance was reported by the study authors or, when necessary, calculated by the WWC to correct for clustering within classrooms or schools (corrections for multiple comparisons were not done for findings not included in the overall rating). For an explanation about the clustering correction, see the WWC Tutorial on Mismatch. For the formulas the WWC used to calculate statistical significance, see Technical Details of WWC-Conducted Computations. In the cases of Macaruso, Hook, and McCabe (2006) and Macaruso and Walker (2008), a correction for clustering was needed, so the significance levels may differ from those reported in the original study.
[9] The study did not include specific information on the number of clusters (classrooms) across which the sub-sample of Title I students were distributed. The WWC assumed 10 classrooms, the number of classes in the full sample. With any fewer than 10 classrooms, however, this comparison would no longer be statistically significant.
[10] In this study, the authors did an ANCOVA-adjustment for pretest scores when calculating statistical significance but presented raw means and standard deviations.

Appendix A4.2 Summary of subscale and subgroup findings for the comprehension domain [1]

Authors' findings from the study, mean outcome (standard deviation) [2]: Lexia Reading group, Comparison group
WWC calculations: Mean difference [3], Effect size [4], Statistical significance [5], Improvement index [6]

Outcome measure | Study sample | Sample size (clusters/students) | Lexia Reading group | Comparison group | Mean difference (Lexia Reading – comparison) | Effect size | Statistical significance (at α = 0.05) | Improvement index

Macaruso, Hook, & McCabe, 2006 [7, 8, 9]
Construct: Vocabulary Development
Gates-MacGinitie: Basic Story Words Subtest | Grade 1: Title I Students | 10/30 | 23.30 (3.50) | 21.50 (3.50) | 1.80 | 0.50 | ns | +19

Macaruso & Walker, 2008 [8, 10]
Construct: Reading Comprehension
Gates-MacGinitie: Listening Comprehension | Kindergarten: Low Performers | 6/24 | 13.40 (4.10) | 11.50 (3.60) | 1.90 | 0.48 | ns | +18

ns = not statistically significant

[1] This appendix presents subscale and subgroup findings for measures that fall in comprehension. Total group (for Macaruso & Walker, 2008) and total scale (for Macaruso, Hook, & McCabe, 2006) scores were used for rating purposes and are presented in Appendices A3.1 and A3.4, respectively.
[2] The standard deviation across all students in each group shows how dispersed the participants’ outcomes are: a smaller standard deviation on a given measure would indicate that participants had more similar outcomes.
[3] Positive differences and effect sizes favor the intervention group; negative differences and effect sizes favor the comparison group.
[4] For an explanation of the effect size calculation, see Technical Details of WWC-Conducted Computations.
[5] Statistical significance is the probability that the difference between groups is a result of chance rather than a real difference between the groups.
[6] The improvement index represents the difference between the percentile rank of the average student in the intervention condition and that of the average student in the comparison condition. The improvement index can take on values between –50 and +50, with positive numbers denoting results favorable to the intervention group.
[7] Means and standard deviations are ANCOVA-adjusted for pretest differences, as reported in communication with the author.
[8] The level of statistical significance was reported by the study authors or, when necessary, calculated by the WWC to correct for clustering within classrooms or schools (corrections for multiple comparisons were not done for findings not included in the overall rating). For an explanation about the clustering correction, see the WWC Tutorial on Mismatch. For the formulas the WWC used to calculate statistical significance, see Technical Details of WWC-Conducted Computations. In the cases of Macaruso, Hook, and McCabe (2006) and Macaruso and Walker (2008), a correction for clustering was needed, so the significance levels may differ from those reported in the original study.
[9] The study did not include specific information on the number of clusters (classrooms) across which the sub-sample of Title I students were distributed. The WWC assumed 10 classrooms, the number of classes in the full sample. With any fewer than 10 classrooms, however, this comparison would no longer be statistically significant.
[10] In this study, the authors did an ANCOVA-adjustment for pretest scores when calculating statistical significance but presented raw means and standard deviations.
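
The effect size column can be checked by hand. The sketch below assumes the WWC effect size is Hedges' g, that is, the mean difference divided by the pooled standard deviation and multiplied by a small-sample correction of 1 − 3/(4N − 9); the authoritative formula is in Technical Details of WWC-Conducted Computations. The function name `hedges_g` is illustrative, and the split of 15 students per group for the Title I subsample is taken from the study description (30 students in total).

```python
import math


def hedges_g(mean_t, mean_c, sd_t, sd_c, n_t, n_c):
    """Standardized mean difference with a small-sample correction (Hedges' g).

    Assumption: this mirrors the effect size the WWC reports; it is a sketch,
    not the WWC's own code.
    """
    # Pooled within-group standard deviation.
    pooled_sd = math.sqrt(
        ((n_t - 1) * sd_t**2 + (n_c - 1) * sd_c**2) / (n_t + n_c - 2)
    )
    # Small-sample correction factor, with N the total number of students.
    omega = 1 - 3 / (4 * (n_t + n_c) - 9)
    return omega * (mean_t - mean_c) / pooled_sd


# Basic Story Words Subtest, Grade 1 Title I students (15 per group assumed).
g = hedges_g(23.30, 21.50, 3.50, 3.50, 15, 15)
print(f"{g:.2f}")  # 0.50
```

Under these assumptions the result matches the 0.50 reported in the table.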

Appendix A4.3 Summary of subgroup and composite score findings for the general reading achievement domain [1]

Authors' findings from the study, mean outcome (standard deviation) [2]: Lexia Reading group, Comparison group
WWC calculations: Mean difference [3], Effect size [4], Statistical significance [5], Improvement index [6]

Outcome measure | Study sample | Sample size (clusters/students) | Lexia Reading group | Comparison group | Mean difference (Lexia Reading – comparison) | Effect size | Statistical significance (at α = 0.05) | Improvement index

Macaruso, Hook, & McCabe, 2006 [7, 8, 9]
Gates-MacGinitie Reading Test, Level BR: Form S | Grade 1: Title I Students | 10/30 | 62.10 (13.70) | 49.70 (13.70) | 12.40 | 0.88 | Statistically significant | +31

Macaruso & Walker, 2008 [8, 10]
Gates-MacGinitie Reading Test, Level PR | Kindergarten: Full Sample | 6/71 | 54.20 (nr) | 46.40 (nr) | 7.80 | 0.47 | ns | +18
Gates-MacGinitie Reading Test, Level PR | Kindergarten: Low Performers | 6/24 | 55.80 (nr) | 41.60 (nr) | 14.20 | 1.51 | Statistically significant | +43

ns = not statistically significant
nr = not reported

[1] This appendix presents subgroup and composite score findings for measures that fall in general reading achievement. In the case of Macaruso, Hook, and McCabe (2006), total group scores were used for rating purposes and are presented in Appendix A3.4. In the case of Macaruso and Walker (2008), subtest scores were used for rating purposes and are presented in Appendices A3.1 and A3.3.
[2] The standard deviation across all students in each group shows how dispersed the participants’ outcomes are: a smaller standard deviation on a given measure would indicate that participants had more similar outcomes.
[3] Positive differences and effect sizes favor the intervention group; negative differences and effect sizes favor the comparison group.
[4] For an explanation of the effect size calculation, see Technical Details of WWC-Conducted Computations.
[5] Statistical significance is the probability that the difference between groups is a result of chance rather than a real difference between the groups.
[6] The improvement index represents the difference between the percentile rank of the average student in the intervention condition and that of the average student in the comparison condition. The improvement index can take on values between –50 and +50, with positive numbers denoting results favorable to the intervention group.
[7] Means and standard deviations are ANCOVA-adjusted for pretest differences, as reported in communication with the author.
[8] The level of statistical significance was reported by the study authors or, when necessary, calculated by the WWC to correct for clustering within classrooms or schools (corrections for multiple comparisons were not done for findings not included in the overall rating). For an explanation about the clustering correction, see the WWC Tutorial on Mismatch. For the formulas the WWC used to calculate statistical significance, see Technical Details of WWC-Conducted Computations. In the cases of Macaruso, Hook, and McCabe (2006) and Macaruso and Walker (2008), a correction for clustering was needed, so the significance levels may differ from those reported in the original study.
[9] The study did not include specific information on the number of clusters (classrooms) across which the sub-sample of Title I students were distributed. The WWC assumed 10 classrooms, the number of classes in the full sample. With any fewer than 10 classrooms, however, this comparison would no longer be statistically significant.
[10] In this study, the authors did an ANCOVA-adjustment for pretest scores when calculating statistical significance but presented raw means and standard deviations.
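
The improvement index described in the footnotes can be reproduced from the effect size alone. The sketch below assumes outcomes are normally distributed, so that the average intervention student sits at the percentile given by the standard normal CDF of the effect size, while the average comparison student sits at the 50th percentile. The function name `improvement_index` is illustrative and not taken from the WWC documentation.

```python
from statistics import NormalDist


def improvement_index(effect_size):
    """Percentile-point gain of the average intervention student.

    Assumption: normally distributed outcomes, so the comparison-group
    average is at the 50th percentile and the intervention-group average
    is at Phi(effect_size) * 100.
    """
    return NormalDist().cdf(effect_size) * 100 - 50


# Effect sizes from the general reading achievement rows.
for g in (0.88, 0.47, 1.51):
    print(f"{g:.2f} -> {round(improvement_index(g)):+d}")
# 0.88 -> +31
# 0.47 -> +18
# 1.51 -> +43
```

Because the normal CDF is bounded by 0 and 1, the index stays between –50 and +50, as the footnote states.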

Appendix A5.1 Lexia Reading rating for the alphabetics domain

The WWC rates an intervention’s effects in a given outcome domain as positive, potentially positive, mixed, no discernible effects, potentially negative, or negative.[1]

For the outcome domain of alphabetics, the WWC rated Lexia Reading as potentially positive. The remaining ratings (mixed effects, no discernible effects, potentially negative effects, and negative effects) were not considered, as Lexia Reading was assigned the highest applicable rating.

Rating received

Potentially positive effects: Evidence of a positive effect with no overriding contrary evidence.

  • Criterion 1: At least one study showing a statistically significant or substantively important positive effect.

    Met. One study showed a statistically significant positive effect.

AND
  • Criterion 2: No studies showing a statistically significant or substantively important negative effect and fewer or the same number of studies showing indeterminate effects than showing statistically significant or substantively important positive effects.

    Met. Neither study showed statistically significant or substantively important negative effects, and only one study showed indeterminate effects.

Other ratings considered

Positive effects: Strong evidence of a positive effect with no overriding contrary evidence.

  • Criterion 1: Two or more studies showing statistically significant positive effects, at least one of which met WWC evidence standards for a strong design.

    Not met. Only one study showed a statistically significant positive effect.

AND
  • Criterion 2: No studies showing statistically significant or substantively important negative effects.

    Met. Neither study showed negative effects.

[1] For rating purposes, the WWC considers the statistical significance of individual outcomes and the domain-level effect. The WWC also considers the size of the domain-level effect for ratings of potentially positive or potentially negative effects. For a complete description, see the WWC Intervention Rating Scheme.

Appendix A5.2 Lexia Reading rating for the fluency domain

The WWC rates an intervention's effects in a given outcome domain as positive, potentially positive, mixed, no discernible effects, potentially negative, or negative.[1]

For the outcome domain of fluency, the WWC rated Lexia Reading as having no discernible effects.

Rating received

No discernible effects: No affirmative evidence of effects.

  • Criterion 1: None of the studies shows a statistically significant or substantively important effect, either positive or negative.

    Met. Only one study examined outcomes in this domain, and the effect was neither statistically significant nor substantively important.

Other ratings considered

Positive effects: Strong evidence of a positive effect with no overriding contrary evidence.

  • Criterion 1: Two or more studies showing statistically significant positive effects, at least one of which met WWC evidence standards for a strong design.

    Not met. Only one study examined outcomes in this domain.

AND
  • Criterion 2: No studies showing statistically significant or substantively important negative effects.

    Met. No study showed a statistically significant or substantively important negative effect.

Potentially positive effects: Evidence of a positive effect with no overriding contrary evidence.

  • Criterion 1: At least one study showing a statistically significant or substantively important positive effect.

    Not met. Only one study examined outcomes in this domain, and that study did not show a statistically significant or substantively important positive effect.

AND
  • Criterion 2: No studies showing a statistically significant or substantively important negative effect and fewer or the same number of studies showing indeterminate effects than showing statistically significant or substantively important positive effects.

    Met. No study showed a statistically significant or substantively important effect, either positive or negative.

Mixed effects: Evidence of inconsistent effects as demonstrated through either of the following criteria.

  • Criterion 1: At least one study showing a statistically significant or substantively important positive effect, and at least one study showing a statistically significant or substantively important negative effect, but no more such studies than the number showing a statistically significant or substantively important positive effect.

    Not met. No study showed a statistically significant or substantively important effect, either positive or negative.

OR
  • Criterion 2: At least one study showing a statistically significant or substantively important effect, and more studies showing an indeterminate effect than showing a statistically significant or substantively important effect.

    Not met. No study showed a statistically significant or substantively important effect.

Potentially negative effects: Evidence of a negative effect with no overriding contrary evidence.

  • Criterion 1: At least one study showing a statistically significant or substantively important negative effect.

    Not met. No study showed a statistically significant or substantively important negative effect.

AND
  • Criterion 2: No studies showing a statistically significant or substantively important positive effect, or more studies showing statistically significant or substantively important negative effects than showing statistically significant or substantively important positive effects.

    Met. No study showed a statistically significant or substantively important positive effect.

Negative effects: Strong evidence of a negative effect with no overriding contrary evidence.

  • Criterion 1: Two or more studies showing statistically significant negative effects, at least one of which met WWC evidence standards for a strong design.

    Not met. No study showed a statistically significant negative effect.

AND
  • Criterion 2: No studies showing statistically significant or substantively important positive effects.

    Met. No study showed a statistically significant or substantively important positive effect.

[1] For rating purposes, the WWC considers the statistical significance of individual outcomes and the domain-level effect. The WWC also considers the size of the domain-level effect for ratings of potentially positive or potentially negative effects. For a complete description, see the WWC Intervention Rating Scheme.
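
The criteria listed in this appendix amount to a small decision procedure over study-level results. The sketch below encodes them as written here; it is an illustration, not the WWC Intervention Rating Scheme itself. The names `Study` and `rate_domain`, the effect labels, and the order in which ratings are tested (in particular, testing "negative" before "potentially negative") are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class Study:
    # One of: "sig_pos", "subst_pos", "indeterminate", "subst_neg", "sig_neg"
    # (hypothetical labels for this sketch).
    effect: str
    strong_design: bool = False  # met WWC evidence standards for a strong design


def rate_domain(studies):
    sig_pos = [s for s in studies if s.effect == "sig_pos"]
    sig_neg = [s for s in studies if s.effect == "sig_neg"]
    # Statistically significant OR substantively important effects.
    pos = sum(s.effect in ("sig_pos", "subst_pos") for s in studies)
    neg = sum(s.effect in ("sig_neg", "subst_neg") for s in studies)
    ind = sum(s.effect == "indeterminate" for s in studies)

    if len(sig_pos) >= 2 and any(s.strong_design for s in sig_pos) and neg == 0:
        return "positive"
    if pos >= 1 and neg == 0 and ind <= pos:
        return "potentially positive"
    # Mixed: criterion 1 OR criterion 2.
    if (pos >= 1 and 1 <= neg <= pos) or (pos + neg >= 1 and ind > pos + neg):
        return "mixed"
    if pos == 0 and neg == 0:
        return "no discernible effects"
    if len(sig_neg) >= 2 and any(s.strong_design for s in sig_neg) and pos == 0:
        return "negative"
    if neg >= 1 and (pos == 0 or neg > pos):
        return "potentially negative"
    return "unrated"  # not expected to be reached


# Fluency domain: a single study with an indeterminate effect.
print(rate_domain([Study("indeterminate")]))  # no discernible effects
```

Applied to the alphabetics domain (one statistically significant positive study and one indeterminate study), the same function returns "potentially positive", in line with Appendix A5.1.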

Appendix A5.3 Lexia Reading rating for the comprehension domain

The WWC rates an intervention's effects in a given outcome domain as positive, potentially positive, mixed, no discernible effects, potentially negative, or negative.[1]

For the outcome domain of comprehension, the WWC rated Lexia Reading as potentially positive. The remaining ratings (mixed effects, no discernible effects, potentially negative effects, and negative effects) were not considered, as Lexia Reading was assigned the highest applicable rating.

Rating received

Potentially positive effects: Evidence of a positive effect with no overriding contrary evidence.

  • Criterion 1: At least one study showing a statistically significant or substantively important positive effect.

    Met. Only one study examined outcomes in this domain, and that study showed a substantively important positive effect.

AND
  • Criterion 2: No studies showing a statistically significant or substantively important negative effect and fewer or the same number of studies showing indeterminate effects than showing statistically significant or substantively important positive effects.

    Met. Only one study examined outcomes in this domain, and that study showed a substantively important positive effect.

Other ratings considered

Positive effects: Strong evidence of a positive effect with no overriding contrary evidence.

  • Criterion 1: Two or more studies showing statistically significant positive effects, at least one of which met WWC evidence standards for a strong design.

    Not met. Only one study examined outcomes in this domain.

AND
  • Criterion 2: No studies showing statistically significant or substantively important negative effects.

    Met. Only one study examined outcomes in this domain, and that study showed a substantively important positive effect.

[1] For rating purposes, the WWC considers the statistical significance of individual outcomes and the domain-level effect. The WWC also considers the size of the domain-level effect for ratings of potentially positive or potentially negative effects. For a complete description, see the WWC Intervention Rating Scheme.

Appendix A5.4 Lexia Reading rating for the general reading achievement domain

The WWC rates an intervention's effects in a given outcome domain as positive, potentially positive, mixed, no discernible effects, potentially negative, or negative.[1]

For the outcome domain of general reading achievement, the WWC rated Lexia Reading as having no discernible effects.

Rating received

No discernible effects: No affirmative evidence of effects.

  • Criterion 1: None of the studies shows a statistically significant or substantively important effect, either positive or negative.

    Met. Only one study examined outcomes in this domain, and the effect was neither statistically significant nor substantively important.

Other ratings considered

Positive effects: Strong evidence of a positive effect with no overriding contrary evidence.

  • Criterion 1: Two or more studies showing statistically significant positive effects, at least one of which met WWC evidence standards for a strong design.

    Not met. Only one study examined outcomes in this domain.

AND
  • Criterion 2: No studies showing statistically significant or substantively important negative effects.

    Met. No study showed a statistically significant or substantively important negative effect.

Potentially positive effects: Evidence of a positive effect with no overriding contrary evidence.

  • Criterion 1: At least one study showing a statistically significant or substantively important positive effect.

    Not met. Only one study examined outcomes in this domain, and that study did not show a statistically significant or substantively important positive effect.

AND
  • Criterion 2: No studies showing a statistically significant or substantively important negative effect and fewer or the same number of studies showing indeterminate effects than showing statistically significant or substantively important positive effects.

    Met. Only one study examined outcomes in this domain, and that study showed an indeterminate effect.

Mixed effects: Evidence of inconsistent effects as demonstrated through either of the following criteria.

  • Criterion 1: At least one study showing a statistically significant or substantively important positive effect, and at least one study showing a statistically significant or substantively important negative effect, but no more such studies than the number showing a statistically significant or substantively important positive effect.

    Not met. No studies showed a statistically significant or substantively important effect, either positive or negative.

OR
  • Criterion 2: At least one study showing a statistically significant or substantively important effect, and more studies showing an indeterminate effect than showing a statistically significant or substantively important effect.

    Not met. No studies showed a statistically significant or substantively important effect.

Potentially negative effects: Evidence of a negative effect with no overriding contrary evidence.

  • Criterion 1: At least one study showing a statistically significant or substantively important negative effect.

    Not met. No study showed a statistically significant or substantively important negative effect.

AND
  • Criterion 2: No studies showing a statistically significant or substantively important positive effect, or more studies showing statistically significant or substantively important negative effects than showing statistically significant or substantively important positive effects.

    Met. No study showed a statistically significant or substantively important positive effect.

Negative effects: Strong evidence of a negative effect with no overriding contrary evidence.

  • Criterion 1: Two or more studies showing statistically significant negative effects, at least one of which met WWC evidence standards for a strong design.

    Not met. No study showed a statistically significant negative effect.

AND
  • Criterion 2: No studies showing statistically significant or substantively important positive effects.

    Met. No study showed a statistically significant or substantively important positive effect.

[1] For rating purposes, the WWC considers the statistical significance of individual outcomes and the domain-level effect. The WWC also considers the size of the domain-level effect for ratings of potentially positive or potentially negative effects. For a complete description, see the WWC Intervention Rating Scheme.

Appendix A6 Extent of evidence by domain

Outcome domain | Number of studies | Sample size: schools | Sample size: students | Extent of evidence [1]
Alphabetics | 2 | 3 | 147 | Small
Reading fluency | 1 | 1 | 37 | Small
Comprehension | 1 | 2 | 71 | Small
General reading achievement | 1 | 5 | 167 | Small

[1] A rating of "medium to large" requires at least two studies and two schools across studies in one domain and a total sample size across studies of at least 350 students or 14 classrooms. Otherwise, the rating is "small."
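
The extent-of-evidence rule in the footnote can be written as a short check. This is a sketch under stated assumptions: the function name `extent_of_evidence` is illustrative, and because the table does not list classroom counts, the `classrooms` argument defaults to 0, meaning the classroom threshold is treated as unmet unless a count is supplied.

```python
def extent_of_evidence(studies, schools, students, classrooms=0):
    """Classify the extent of evidence for one outcome domain.

    Assumption: classrooms=0 stands for "count not available", so only the
    student threshold can satisfy the sample size requirement.
    """
    enough_studies = studies >= 2 and schools >= 2
    enough_sample = students >= 350 or classrooms >= 14
    return "medium to large" if enough_studies and enough_sample else "small"


# Rows from the table: (number of studies, schools, students).
domains = {
    "Alphabetics": (2, 3, 147),
    "Reading fluency": (1, 1, 37),
    "Comprehension": (1, 2, 71),
    "General reading achievement": (1, 5, 167),
}
for name, (n_studies, n_schools, n_students) in domains.items():
    print(name, "->", extent_of_evidence(n_studies, n_schools, n_students))
```

Every domain comes out as "small": three have only one study, and alphabetics has 147 students, below the 350-student threshold.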
