Identifying and Implementing Educational Practices Supported By Rigorous Evidence: A User Friendly Guide

December 2003

References

¹ Evidence from randomized controlled trials, discussed in the following journal articles, suggests that one-on-one tutoring of at-risk readers by a well-trained tutor yields an effect size of about 0.7. This means that the average tutored student reads more proficiently than approximately 75 percent of the untutored students in the control group. Barbara A. Wasik and Robert E. Slavin, "Preventing Early Reading Failure With One-To-One Tutoring: A Review of Five Programs," Reading Research Quarterly, vol. 28, no. 2, April/May/June 1993, pp. 178-200 (the three programs evaluated in randomized controlled trials produced effect sizes falling mostly between 0.5 and 1.0). Barbara A. Wasik, "Volunteer Tutoring Programs in Reading: A Review," Reading Research Quarterly, vol. 33, no. 3, July/August/September 1998, pp. 266-292 (the two programs using well-trained volunteer tutors that were evaluated in randomized controlled trials produced effect sizes of 0.5 to 1.0 and 0.50, respectively). Patricia F. Vadasy, Joseph R. Jenkins, and Kathleen Pool, "Effects of Tutoring in Phonological and Early Reading Skills on Students at Risk for Reading Disabilities," Journal of Learning Disabilities, vol. 33, no. 4, July/August 2000, pp. 579-590 (randomized controlled trial of a program using well-trained nonprofessional tutors showed effect sizes of 0.4 to 1.2).
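
The step from an effect size of about 0.7 to "approximately 75 percent" can be checked with a short calculation. The sketch below is illustrative only and assumes that test scores in both groups are roughly normally distributed with similar spread, an assumption the endnote does not state; the function name is hypothetical.

```python
import math


def share_of_controls_below(effect_size: float) -> float:
    """Share of the control group scoring below the average treated student.

    Assumes (this is an assumption, not stated in the source) that outcomes
    are approximately normal with equal variance in both groups, so the
    share equals the standard normal CDF evaluated at the effect size.
    """
    return 0.5 * (1.0 + math.erf(effect_size / math.sqrt(2.0)))


if __name__ == "__main__":
    # 0.7 is the tutoring estimate; 0.5 and 1.0 bracket the range
    # reported for most of the individual programs.
    for d in (0.5, 0.7, 1.0):
        share = share_of_controls_below(d)
        print(f"effect size {d:.1f}: share of controls scoring lower = {share:.3f}")
```

Under that normality assumption, an effect size of 0.7 corresponds to a share of about 0.76, which is consistent with the rounded figure in the note.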

² Gilbert J. Botvin et al., "Long-Term Follow-up Results of a Randomized Drug Abuse Prevention Trial in a White, Middle-class Population," Journal of the American Medical Association, vol. 273, no. 14, April 12, 1995, pp. 1106-1112. Gilbert J. Botvin with Lori Wolfgang Kantor, "Preventing Alcohol and Tobacco Use Through Life Skills Training: Theory, Methods, and Empirical Findings," Alcohol Research and Health, vol. 24, no. 4, 2000, pp. 250-257.

³ Frederick Mosteller, Richard J. Light, and Jason A. Sachs, "Sustained Inquiry in Education: Lessons from Skill Grouping and Class Size," Harvard Educational Review, vol. 66, no. 4, winter 1996, pp. 797-842. The small classes averaged 15 students; the regular-sized classes averaged 23 students.

⁴ These are the findings specifically of the randomized controlled trials reviewed in "Teaching Children To Read: An Evidence-Based Assessment of the Scientific Research Literature on Reading and Its Implications for Reading Instruction," Report of the National Reading Panel, 2000.

⁵ Frances A. Campbell et al., "Early Childhood Education: Young Adult Outcomes From the Abecedarian Project," Applied Developmental Science, vol. 6, no. 1, 2002, pp. 42-57. Craig T. Ramey, Frances A. Campbell, and Clancy Blair, "Enhancing the Life Course for High-Risk Children: Results from the Abecedarian Project," in Social Programs That Work, edited by Jonathan Crane (Russell Sage Foundation, 1998), pp. 163-183.

⁶ For example, randomized controlled trials showed that (i) welfare reform programs that emphasized short-term job-search assistance and encouraged participants to find work quickly had larger effects on employment, earnings, and welfare dependence than programs that emphasized basic education; (ii) the work-focused programs were also much less costly to operate; and (iii) welfare-to-work programs often reduced net government expenditures. The trials also identified a few approaches that were particularly successful. See, for example, Manpower Demonstration Research Corporation, National Evaluation of Welfare-to-Work Strategies: How Effective Are Different Welfare-to-Work Approaches? Five-Year Adult and Child Impacts for Eleven Programs (U.S. Department of Health and Human Services and U.S. Department of Education, November 2001). These valuable findings were a key to the political consensus behind the 1996 federal welfare reform legislation and its strong work requirements, according to leading policymakers - including Ron Haskins, who in 1996 was the staff director of the House Ways and Means Subcommittee with jurisdiction over the bill.

⁷ See, for example, the Food and Drug Administration's standard for assessing the effectiveness of pharmaceutical drugs and medical devices, at 21 C.F.R. §314.126. See also "The Urgent Need to Improve Health Care Quality," Consensus statement of the Institute of Medicine National Roundtable on Health Care Quality, Journal of the American Medical Association, vol. 280, no. 11, September 16, 1998, p. 1003; and Gary Burtless, "The Case for Randomized Field Trials in Economic and Policy Research," Journal of Economic Perspectives, vol. 9, no. 2, spring 1995, pp. 63-84.

⁸ Robert G. St. Pierre et al., "Improving Family Literacy: Findings From the National Even Start Evaluation," Abt Associates, September 1996.

⁹ Jean Baldwin Grossman, "Evaluating Social Policies: Principles and U.S. Experience," The World Bank Research Observer, vol. 9, no. 2, July 1994, pp. 159-181.

¹⁰ Roberto Agodini and Mark Dynarski, "Are Experiments the Only Option? A Look at Dropout Prevention Programs," Mathematica Policy Research, Inc., August 2001, at http://www.mathematica-mpr.com/PDFs/redirect.asp?strSite=experonly.pdf.

¹¹ Elizabeth Ty Wilde and Rob Hollister, "How Close Is Close Enough? Testing Nonexperimental Estimates of Impact against Experimental Estimates of Impact with Education Test Scores as Outcomes," Institute for Research on Poverty Discussion paper, no. 1242-02, 2002, at http://www.ssc.wisc.edu/irp/.

¹² Howard S. Bloom et al., "Can Nonexperimental Comparison Group Methods Match the Findings from a Random Assignment Evaluation of Mandatory Welfare-to-Work Programs?" MDRC Working Paper on Research Methodology, June 2002, at http://www.mdrc.org/ResearchMethodologyPprs.htm. James J. Heckman, Hidehiko Ichimura, and Petra E. Todd, "Matching As An Econometric Evaluation Estimator: Evidence from Evaluating a Job Training Programme," Review of Economic Studies, vol. 64, no. 4, 1997, pp. 605-654. Daniel Friedlander and Philip K. Robins, "Evaluating Program Evaluations: New Evidence on Commonly Used Nonexperimental Methods," American Economic Review, vol. 85, no. 4, September 1995, pp. 923-937. Thomas Fraker and Rebecca Maynard, "The Adequacy of Comparison Group Designs for Evaluations of Employment-Related Programs," Journal of Human Resources, vol. 22, no. 2, spring 1987, pp. 194-227. Robert J. LaLonde, "Evaluating the Econometric Evaluations of Training Programs With Experimental Data," American Economic Review, vol. 76, no. 4, September 1986, pp. 604-620.

¹³ This literature, including the studies listed in the three preceding endnotes, is systematically reviewed in Steve Glazerman, Dan M. Levy, and David Myers, "Nonexperimental Replications of Social Experiments: A Systematic Review," Mathematica Policy Research discussion paper, no. 8813-300, September 2002. The portion of this review addressing labor market interventions is published in "Nonexperimental versus Experimental Estimates of Earnings Impact," The Annals of the American Academy of Political and Social Science, vol. 589, September 2003.

¹⁴ J.E. Manson et al., "Estrogen Plus Progestin and the Risk of Coronary Heart Disease," New England Journal of Medicine, August 7, 2003, vol. 349, no. 6, pp. 519-522. International Position Paper on Women's Health and Menopause: A Comprehensive Approach, National Heart, Lung, and Blood Institute of the National Institutes of Health, and Giovanni Lorenzini Medical Science Foundation, NIH Publication No. 02-3284, July 2002, pp. 159-160. Stephen MacMahon and Rory Collins, "Reliable Assessment of the Effects of Treatment on Mortality and Major Morbidity, II: Observational Studies," The Lancet, vol. 357, February 10, 2001, p. 458. Sylvia Wassertheil-Smoller et al., "Effect of Estrogen Plus Progestin on Stroke in Postmenopausal Women - The Women's Health Initiative: A Randomized Controlled Trial," Journal of the American Medical Association, May 28, 2003, vol. 289, no. 20, pp. 2673-2684.

¹⁵ Howard S. Bloom, "Sample Design for an Evaluation of the Reading First Program," an MDRC paper prepared for the U.S. Department of Education, March 14, 2003. Robert E. Slavin, "Practical Research Designs for Randomized Evaluations of Large-Scale Educational Interventions: Seven Desiderata," paper presented at the annual meeting of the American Educational Research Association, Chicago, April 2003.

¹⁶ The "standardized effect size" is calculated as the difference in the mean outcome between the treatment and control groups, divided by the pooled standard deviation.
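
As a rough illustration of this calculation, the sketch below computes a standardized effect size from two lists of scores. The note does not say how the standard deviation is pooled, so the degrees-of-freedom-weighted pooled variance used here is an assumption, and the function name and scores are hypothetical.

```python
import math
from statistics import mean, variance


def standardized_effect_size(treatment, control):
    """Difference in group means divided by the pooled standard deviation.

    The pooling rule is an assumption (the source only says "pooled"):
    each group's sample variance is weighted by its degrees of freedom.
    Each group needs at least two observations.
    """
    n_t, n_c = len(treatment), len(control)
    pooled_variance = (
        (n_t - 1) * variance(treatment) + (n_c - 1) * variance(control)
    ) / (n_t + n_c - 2)
    return (mean(treatment) - mean(control)) / math.sqrt(pooled_variance)


if __name__ == "__main__":
    # Hypothetical test scores, for illustration only.
    treated_scores = [12, 14, 16, 18, 20]
    control_scores = [10, 12, 14, 16, 18]
    d = standardized_effect_size(treated_scores, control_scores)
    print(f"standardized effect size: {d:.2f}")
```

In this made-up example the group means differ by 2 points and the pooled standard deviation is the square root of 10, so the effect size is about 0.63.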

¹⁷ Rory Collins and Stephen MacMahon, "Reliable Assessment of the Effects of Treatment on Mortality and Major Morbidity, I: Clinical Trials," The Lancet, vol. 357, February 3, 2001, p. 375.

¹⁸ Robinson G. Hollister, "The Growth of After-School Programs and Their Impact," paper commissioned by the Brookings Institution's Roundtable on Children, February 2003, at http://www.brook.edu/dybdocroot/views/papers/sawhill/20030225.pdf. Myles Maxfield, Allen Schirm, and Nuria Rodriguez-Planas, "The Quantum Opportunity Program Demonstration: Implementation and Short-Term Impacts," Mathematica Policy Research (no. 8279-093), August 2003.

¹⁹ Guidance for Industry: Providing Clinical Evidence of Effectiveness for Human Drugs and Biological Products, Food and Drug Administration, May 1998, pp. 2-5.

²⁰ Robert J. Temple, Director of the Office of Medical Policy, Center for Drug Evaluation and Research, Food and Drug Administration, quoted in Gary Taubes, "Epidemiology Faces Its Limits," Science, vol. 269, issue 5221, p. 169.

²¹ Debra Viadero, "Researchers Debate Impact of Tests," Education Week, vol. 22, no. 21, February 5, 2003, p. 1.

²² E. Barrett-Connor and D. Grady, "Hormone Replacement Therapy, Heart Disease, and Other Considerations," Annual Review of Public Health, vol. 19, 1998, pp. 55-72.

²³ Frederick Mosteller, Richard J. Light, and Jason A. Sachs, op. cit., no. 3.

²⁴ Brian Stecher et al., "Class-Size Reduction in California: A Story of Hope, Promise, and Unintended Consequences," Phi Delta Kappan, vol. 82, no. 9, May 2001, pp. 670-674.

²⁵ David L. Olds et al., "Long-term Effects of Nurse Home Visitation on Children's Criminal and Antisocial Behavior: 15-Year Follow-up of a Randomized Controlled Trial," Journal of the American Medical Association, vol. 280, no. 14, October 14, 1998, pp. 1238-1244. David L. Olds et al., "Long-term Effects of Home Visitation on Maternal Life Course and Child Abuse and Neglect: 15-Year Follow-up of a Randomized Trial," Journal of the American Medical Association, vol. 278, no. 8, pp. 637-643. David L. Olds et al., "Home Visiting By Paraprofessionals and By Nurses: A Randomized, Controlled Trial," Pediatrics, vol. 110, no. 3, September 2002, pp. 486-496. Harriet Kitzman et al., "Effect of Prenatal and Infancy Home Visitation by Nurses on Pregnancy Outcomes, Childhood Injuries, and Repeated Childbearing," Journal of the American Medical Association, vol. 278, no. 8, August 27, 1997, pp. 644-652.

²⁶ For example, see Robert G. St. Pierre et al., op. cit., no. 8; Karen McCurdy, "Can Home Visitation Enhance Maternal Social Support?" American Journal of Community Psychology, vol. 29, no. 1, 2001, pp. 97-112.
