Information on IES-Funded Research
Grant Closed

Testing Different Methods of Improving the External Validity of Impact Evaluations in Education

NCER
Program: Statistical and Research Methodology in Education
Program topic(s): Core
Award amount: $489,178
Principal investigator: Robert Olsen
Awardee: Abt Associates, Inc.
Year: 2010
Project type: Methodological Innovation
Award number: R305D100041

Purpose

This study was motivated by the observation that most major, multi-site evaluations in education have chosen participating sites (e.g., districts, schools, or grantees) "purposively" and not randomly. This raises possible concerns about the generalizability of the findings from these studies. The goal of this project was to provide evidence regarding the external validity of evaluations that are based on purposive samples.

Project Activities

Part 1 of the study considered how, and under what conditions, evaluations of educational programs that select sites purposively can produce externally valid impact estimates for the program as a whole. To address this question, this project (1) conducted simulations using data from real educational program evaluations to estimate how different the impacts are likely to be between purposive samples of sites and random samples of sites; (2) identified and developed methods for making findings from purposive samples more representative of program sites; (3) assessed the conditions under which these methods produce unbiased impact and standard error estimates; and (4) tested how well these methods work in real evaluations of educational programs.
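To make the simulation step in (1) concrete, the following is a minimal sketch in Python. The site characteristic, the impact model, and the selection rule are hypothetical stand-ins chosen only to illustrate how purposive site selection can bias an average impact estimate; they are not the project's actual data or design.

import numpy as np

rng = np.random.default_rng(0)

# Hypothetical population of program sites: each site has a characteristic
# (e.g., size or urbanicity) that is correlated with its true program impact.
n_sites = 1000
site_trait = rng.normal(size=n_sites)                      # standardized site characteristic
site_impact = 0.10 + 0.05 * site_trait + rng.normal(scale=0.05, size=n_sites)
true_average_impact = site_impact.mean()

def random_sample_estimate(k):
    """Average impact in a simple random sample of k sites."""
    idx = rng.choice(n_sites, size=k, replace=False)
    return site_impact[idx].mean()

def purposive_sample_estimate(k):
    """Average impact in a 'purposive' sample: sites with a larger trait value
    are more likely to be selected (mimicking convenience-based recruitment)."""
    selection_prob = np.exp(site_trait) / np.exp(site_trait).sum()
    idx = rng.choice(n_sites, size=k, replace=False, p=selection_prob)
    return site_impact[idx].mean()

n_reps, k = 2000, 30
random_bias = np.mean([random_sample_estimate(k) for _ in range(n_reps)]) - true_average_impact
purposive_bias = np.mean([purposive_sample_estimate(k) for _ in range(n_reps)]) - true_average_impact

print(f"True average impact:          {true_average_impact:.4f}")
print(f"Bias, random site samples:    {random_bias:+.4f}")
print(f"Bias, purposive site samples: {purposive_bias:+.4f}")

Because the simulated selection rule favors sites with larger impacts, the purposive estimates are biased upward relative to the population average, while the random samples are not; the project's actual simulations used data from real evaluations rather than generated values.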

People and institutions involved

IES program contact(s)

Allen Ruby

Associate Commissioner for Policy and Systems
NCER

Products and publications

ERIC Citations: Available citations for this award can be found in ERIC.

Journal article, monograph, or newsletter

Bell, S.H., Olsen, R.B., Orr, L.L., and Stuart, E.A. (2016). Estimates of External Validity Bias When Impact Evaluations Select Sites Nonrandomly. Educational Evaluation and Policy Analysis, 38(2), 318-335.

Bell, S.H., and Stuart, E.A. (2016). On the "Where" of Social Experiments: The Nature and Extent of the Generalizability Problem. New Directions for Evaluation, 2016(152), 47-59.

Olsen, R.B., and Orr, L.L. (2016). On the "Where" of Social Experiments: Selecting More Representative Samples to Inform Policy. New Directions for Evaluation, 2016(152), 61-71.

Olsen, R.B., Orr, L.L., Bell, S.H., and Stuart, E.A. (2013). External Validity in Policy Evaluations That Choose Sites Purposively. Journal of Policy Analysis and Management, 32(1), 107-121.

Supplemental information

Co-Principal Investigator: Bell, Stephen

Part 2 of the study considered how, and under what conditions, evaluations of educational interventions can produce externally valid estimates of the interventions' impacts for schools and districts that did not participate in the evaluation. To address this question, the project (1) identified and developed methods for predicting the impacts of an intervention for sites that are not participating in the evaluation; (2) assessed the conditions under which these methods produce unbiased impact and standard error estimates for these sites; and (3) tested how well these methods work in real evaluations of educational programs, including whether their performance depends on how sites were selected for the evaluation.
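A regression-based extrapolation of the kind described in (1) can be sketched as follows. The site characteristics, impact values, and linear specification below are simulated placeholders, not the project's data or its actual models.

import numpy as np

rng = np.random.default_rng(1)

# Hypothetical data: impacts estimated in 30 evaluation sites, plus observable
# characteristics (e.g., enrollment, poverty rate) for both participating and
# non-participating sites. None of this reflects the project's actual data.
n_in, n_out = 30, 200
X_in = rng.normal(size=(n_in, 2))               # characteristics of evaluation sites
X_out = rng.normal(size=(n_out, 2))             # characteristics of other sites
impact_in = 0.10 + X_in @ np.array([0.04, -0.02]) + rng.normal(scale=0.03, size=n_in)

# Regression-based extrapolation: fit impact ~ site characteristics in the
# evaluation sample, then predict impacts for sites outside the evaluation.
design_in = np.column_stack([np.ones(n_in), X_in])
coef, *_ = np.linalg.lstsq(design_in, impact_in, rcond=None)

design_out = np.column_stack([np.ones(n_out), X_out])
predicted_out = design_out @ coef

print("Estimated coefficients:", np.round(coef, 3))
print("Predicted impact, first 5 non-participating sites:",
      np.round(predicted_out[:5], 3))

Such predictions are only as good as the overlap between evaluation sites and the sites being predicted and the adequacy of the chosen specification, which is why the project assessed the conditions under which they are unbiased.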

In both parts of the study, the researchers reanalyzed data from the National Evaluation of Upward Bound (which randomly selected 70 programs nationwide) and the second Reading First Implementation Study (which collected student information for all students within a state). In Part 1 of the study, the study team used these data to simulate both representative and purposive samples, estimate the average impacts from these samples, and compare the results. The study then tested whether regression-based methods and weighting methods can "close the gap" and produce impact estimates from the purposive samples that are closer to the impact estimates from the representative samples. In Part 2 of the study, the study team assessed how accurate average impact estimates are in predicting the impacts for individual sites that did not participate in the evaluation but may use evaluation results in deciding whether to adopt the intervention. In addition, they tested whether regression-based methods or weighting methods can yield improved predictions of the intervention's impact for these sites.
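One common weighting method of the kind tested here is post-stratification, sketched below with simulated strata and impacts. The stratum variable, selection probabilities, and impact sizes are illustrative assumptions, not results from the Upward Bound or Reading First data.

import numpy as np

rng = np.random.default_rng(2)

# Hypothetical population of sites, divided into strata (e.g., urban/rural).
# Stratum membership is associated with both selection into the purposive
# sample and the size of the impact, which is what creates the "gap."
n_sites = 1000
urban = rng.binomial(1, 0.4, size=n_sites)                 # 40% of sites are urban
site_impact = 0.08 + 0.06 * urban + rng.normal(scale=0.03, size=n_sites)
true_average_impact = site_impact.mean()

# Purposive sample that over-represents urban sites.
selection_prob = np.where(urban == 1, 0.15, 0.03)
in_sample = rng.random(n_sites) < selection_prob
sample_impact, sample_urban = site_impact[in_sample], urban[in_sample]

unweighted = sample_impact.mean()

# Post-stratification weights: each sampled site is weighted by the ratio of
# its stratum's population share to its stratum's sample share.
pop_share_urban = urban.mean()
samp_share_urban = sample_urban.mean()
weights = np.where(sample_urban == 1,
                   pop_share_urban / samp_share_urban,
                   (1 - pop_share_urban) / (1 - samp_share_urban))
weighted = np.average(sample_impact, weights=weights)

print(f"True average impact:        {true_average_impact:.4f}")
print(f"Unweighted purposive mean:  {unweighted:.4f}")
print(f"Post-stratified mean:       {weighted:.4f}")

In this toy setup the reweighted estimate moves toward the population average because the stratum variable captures the selection process; when selection depends on unobserved site characteristics, weighting of this kind cannot fully close the gap, which is part of what the project investigated.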

Questions about this project?

To answer additional questions about this project or provide feedback, please contact the program officer.

 

Tags

Mathematics, Data and Assessments
