Catalyzing Comprehension Through Discussion and Debate

Year: 2010
Name of Institution:
Strategic Education Research Partnership Institute
Goal: Multiple Goals
Principal Investigator:
Donovan, Suzanne
Award Amount: $19,352,384
Award Period: 5 years
Award Number: R305F100026


Overview: The Reading for Understanding Initiative was created to develop effective approaches to improving reading comprehension for all students in prekindergarten through grade 12. This grant is one of six awarded as part of this Initiative. The purpose of this grant was to explore the contributions of perspective taking, complex reasoning, and academic language skills to reading for understanding for upper elementary and middle school students; to develop, refine, and test the efficacy of two literacy programs that target the development of these skills for a general population of students and for struggling readers; and to develop professional learning opportunities for science teachers that could serve as a model for other content areas regarding discipline-specific challenges of reading and argumentation. The grant covered four major activities: basic studies, development of interventions, professional development, and efficacy studies.

Basic Studies: The research team hypothesized that deep reading comprehension is a function of simple comprehension (i.e., understanding the meaning of individual words and sentences), background knowledge, and three independent, malleable capacities that had been given little attention previously: perspective taking, academic language skills, and complex reasoning. In order to test the theory, the team developed instruments to measure each of the three capacities: the Social Perspective Taking Acts Measure (SPTAM), Core Academic Language Skills Instrument (CALS-I), and Reflective Judgment Test (RFJ) for complex reasoning. Each of the instruments was independently validated (Diazgranados, Selman, & Dionne, 2016; Uccelli, Barr, Dobbs, Phillips Galloway, Meneses, & Sánchez, 2015).

The team collected longitudinal data over three years in order to explore developmental trajectories for each of the three capacities. The team also conducted studies to explore whether perspective taking, complex reasoning, and academic language: (a) are independent of each other, and (b) can account for variation in comprehension outcomes. The team explored whether the predictive value of the measures varies with different comprehension assessments and in different classroom contexts.

To analyze how these capacities relate to deep comprehension, the team used longitudinal data collected with the Global Integrated Scenario-based Assessment (GISA), a newly developed computer-based assessment designed to reflect students' abilities to evaluate texts, integrate information from multiple texts, and use textual evidence to formulate a position, all features of deep reading comprehension. Using data collected in fall and spring of one academic year with fourth- through seventh-grade students, researchers tested the role of academic language, perspective taking, and complex reasoning in explaining variance in end-of-year GISA scores, controlling for beginning-of-year scores and student demographics. All three predictors explained small but significant amounts of additional variance (LaRusso, Kim, Jones, Selman, Uccelli, Jones, Donovan, & Snow, 2016).

Key Personnel: Robert Selman (Harvard), Paola Uccelli (Harvard), Theo Dawson (Lectica), Kurt Fischer (Harvard), Maria LaRusso (SERP/Harvard), Suzanne Donovan (SERP), Catherine Snow (Harvard)

Development of Interventions: The research team developed multiple programs that rely on discussion and debate to catalyze the growth of academic language skills, perspective-taking ability, and complex reasoning. The first suite of programs, Word Generation (WG), is a set of tier 1, cross-content-area programs for students in grades 4-8. Researchers also developed a tier 2 program, the Strategic Adolescent Reading Intervention (STARI), that targets middle school students reading several grade levels below expectation. It is intended to build their deep comprehension skills at the same time that more basic reading skills are addressed.

The Word Generation suite comprises WordGen Weekly, Science Generation, Social Studies Generation, and WordGen Elementary. WordGen Weekly is a middle school program that exposes students to academic vocabulary, builds perspective-taking skills by providing multiple viewpoints on high-interest, controversial topics, and motivates complex reasoning through the demands of discussion, debate, and writing. This project supported updating, revising, and/or replacing the 72 previously developed units that constituted the original Word Generation program, now called WordGen Weekly. To extend WG into more in-depth treatment of content-area topics in middle school, the project developed 36 entirely new units. Eighteen of the units constitute Social Studies Generation (SoGen), exploring ancient civilizations, geography, and civics. The other 18 units constitute Science Generation (SciGen), treating scientific processes and methods, scientific measurement, ecology, force, energy, and genetics. The project also developed WordGen Elementary, a new program that builds academic language skills, perspective taking, and complex reasoning in fourth and fifth grades. Like the middle-school units, the 24 two-week WordGen Elementary units focus on grade-appropriate, high-interest topics, incorporate instructional materials and tasks that cross content areas, and rely on discussion as a mechanism to catalyze the development of the target capacities.

The STARI intervention has a familial relationship with Word Generation; it too relies on high-interest topics, and uses discussion and debate to actively engage students in perspective taking, complex reasoning, and the use of academic language. STARI is highly structured to support students' word reading, fluency, and simple comprehension skills simultaneously. It uses full-length novels of high interest to students, related nonfiction readings, and project-developed fluency passages at four levels of reading difficulty that relate to the unit theme and build relevant background knowledge. STARI provides materials for two years of instruction, thus offering an option for students to continue if they do not make adequate progress for grade-level work in one year. Level 1 (for sixth/seventh graders) targets the same skill set as level 2 (for seventh/eighth graders), but the topics treated in level 2 are somewhat more sophisticated.

Key Personnel: Catherine Snow (Harvard), Lowry Hemphill (Wheelock), Matthew Ellinger (SERP), Suzanne Donovan (SERP)

Professional Development Model: The research team developed a set of strategies and materials to support the professional learning required to implement WG and STARI. For WG, preparing teachers to lead effective discussion and debate is a central goal of the professional development (PD). STARI shares that goal, but also provides more intensive PD on building basic reading skills, and on program-specific strategies such as partnered reading.

The challenges of discussion and debate, and of reading for understanding more generally, are magnified in science for two reasons: (1) scientific knowledge is considered by many teachers to be settled, leaving little to discuss; and (2) science texts use disciplinary conventions that, until mastered, make comprehension difficult. The project therefore developed and pilot-tested a model of professional development to enhance the skills of middle and elementary teachers of science to teach students how to construct meaning from science texts, with a particular focus on the use of discussion to support comprehension in science classrooms. Video and other data were collected to examine the nature of the change achieved in teaching practice at three points across the two years. Successful strategies that emerged from the work were incorporated into a project website on Reading to Learn in Science, and the Stanford team has offered two iterations of a Reading to Learn in Science MOOC that has reached thousands of participants.

Key Personnel: Jonathan Osborne (Stanford), Catherine O'Connor (Boston University), Lowry Hemphill (Wheelock), Catherine Snow (Harvard), Suzanne Donovan (SERP)

Efficacy Studies: The suite of Word Generation programs was tested in a school-randomized controlled trial in four districts, located in two states in the northeastern region of the United States. The districts include two major cities and one small city serving ethnically diverse, primarily low-income students, and one suburban district serving a primarily white, low-to-middle-income population. The study measured the efficacy of the programs under realistic rather than high-fidelity conditions. Summer institutes were offered, but many participating teachers were unable to attend. Coaching was made available, but in accordance with standard practice, teachers' engagement with coaches was voluntary. The level of implementation varied, with average student workbook completion ranging between 46% and 57% across years and grades.

The team conducted analyses with a sample of 7,773 fourth- through seventh-grade students nested in 25 schools randomized to treatment and control conditions over two years. Results indicate that the program improved students' knowledge of taught vocabulary for both the elementary cohort and the middle-grades cohort. Students in fourth and fifth grades who received the program in year 2 improved on measures of academic language skills, perspective taking, and deep reading comprehension, with larger effects in high-implementing classrooms. In sixth- and seventh-grade classrooms, students showed significant gains in perspective taking and deep comprehension in year 2 (Jones, LaRusso, Kim, Kim, Selman, Uccelli, Barnes, Donovan, & Snow, under revision).

To examine the impact of WG on the quality of classroom interactions, the team conducted observations twice per year for two years and coded them on six dimensions of the Classroom Assessment Scoring System (CLASS): Regard for Adolescent Perspectives, Instructional Learning Formats, Content Understanding, Analysis and Inquiry, Quality of Feedback, and Instructional Dialogue. The team hypothesized that WG classrooms would be rated higher on Regard for Adolescent Perspectives, Analysis and Inquiry, and Instructional Dialogue, the dimensions most closely aligned with WG's theory of change and program design. In both years 1 and 2, WG classrooms observed while implementing WG had significantly higher-quality interactions than control classrooms on all dimensions, with stronger impacts for the dimensions most closely aligned with program theory. WG classrooms were also observed during regular practice (not implementing the WG program); in year 1, the quality of classroom interactions was not statistically different from that observed in control classrooms. However, in year 2, there was evidence that changes in classroom interactions of WG classrooms carried over into regular practice, with positive, significant treatment differences for the three dimensions most closely aligned with program theory: Regard for Adolescent Perspectives, Analysis and Inquiry, and Instructional Dialogue (LaRusso, Jones, Kim, Kim, Barnes, Donovan, & Snow, results presented at the 2016 meeting of the Society for Research on Educational Effectiveness).

To examine structured discussion in classrooms closely, the team developed a new tool: the Low Inference Discourse Observation tool (LIDO). Preliminary analyses provide evidence of its reliability and validity. IRT analyses produced two scores: one for teacher conversational moves, and one for student conversational moves. Researchers found statistically significant, positive treatment effects for both teacher and student conversational moves (O'Connor, LaRusso, & Harbaugh, results presented at the 2016 meeting of the American Educational Research Association).

STARI was evaluated under realistic rather than ideal conditions in a student-randomized controlled trial (n=483) in four districts in diverse communities in a single state. The program's impact was analyzed using both intent-to-treat (ITT) and treatment-on-the-treated (TOT) models. The ITT effects were positive and statistically significant for four out of the six outcomes measured by the RISE assessment: word recognition and decoding, vocabulary, morphology, and efficiency of basic reading. The TOT effects were nearly double the ITT effects, suggesting that greater exposure to the STARI curriculum improved student outcomes. The GISA was administered separately as a measure of reading for understanding, and demonstrated a statistically significant impact of STARI for both the ITT analysis and the TOT analysis (Kim, Hemphill, Troyer, Thomson, Jones, LaRusso, & Donovan, 2016).
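
For readers unfamiliar with the distinction between ITT and TOT estimates, the following is a minimal sketch in Python, not the project's actual analysis. It assumes the simplest possible setup: ITT is the difference in mean outcomes by random assignment, and TOT rescales that difference by the difference in take-up rates between the assigned groups (a Wald-style estimator). The function names, the take-up indicator, and the data are all hypothetical and purely illustrative; the published study may have used different models and covariates.

```python
from statistics import mean


def itt_effect(outcomes, assigned):
    """Intent-to-treat: compare groups by random assignment, ignoring take-up."""
    treat = [y for y, z in zip(outcomes, assigned) if z == 1]
    control = [y for y, z in zip(outcomes, assigned) if z == 0]
    return mean(treat) - mean(control)


def tot_effect(outcomes, assigned, received):
    """Treatment-on-the-treated (assumed Wald-style estimator):
    ITT divided by the difference in take-up rates between assigned groups."""
    take_up_treat = mean(d for d, z in zip(received, assigned) if z == 1)
    take_up_control = mean(d for d, z in zip(received, assigned) if z == 0)
    return itt_effect(outcomes, assigned) / (take_up_treat - take_up_control)


# Made-up data: 4 students assigned to the program, 4 to control.
# Only half of the assigned students actually received the program.
assigned = [1, 1, 1, 1, 0, 0, 0, 0]
received = [1, 1, 0, 0, 0, 0, 0, 0]
outcomes = [12.0, 12.0, 10.0, 10.0, 10.0, 10.0, 10.0, 10.0]

itt = itt_effect(outcomes, assigned)
tot = tot_effect(outcomes, assigned, received)
print(f"ITT = {itt:.2f}, TOT = {tot:.2f}")
```

In this made-up data, take-up among assigned students is one half, so the TOT estimate is twice the ITT estimate. Under this estimator, the lower the take-up among assigned students, the larger the gap between the two figures.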

Key Personnel: Stephanie Jones (Harvard), James Kim (Harvard), Maria LaRusso (SERP/Harvard), Catherine Snow (Harvard), Lowry Hemphill (Wheelock), Catherine O'Connor (Boston University), Suzanne Donovan (SERP)


