Title: Improving the Power of Education Experiments with Auxiliary Data
Principal Investigator: Gagnon-Bartsch, Johann
Awardee: University of Michigan
Program: Statistical and Research Methodology in Education
Award Period: 3 years (03/01/2021 - 02/29/2024)
Award Amount: $576,429
Type: Methodological Innovation
Award Number: R305D210031
Co-Principal Investigators: Heffernan III, Neil; Sales, Adam
The purpose of this project is to develop novel methodology for estimating treatment effects from randomized controlled trials (RCTs) that incorporates large observational remnant datasets and cutting-edge machine learning prediction algorithms to improve precision. The statistical precision of effect estimates from an RCT is limited by the RCT's sample size, which itself is typically subject to practical constraints such as cost. In many cases, RCT estimates may be too imprecise to guide policy or inform science, a problem that is particularly acute for subgroup analyses. The research team will develop statistical methods and data science tools to combine data from RCTs in education with "auxiliary data" gathered from large administrative databases: that is, covariate and outcome data on students or schools that did not participate in the RCT. Precision gains derived from these data would increase the effective sample size, potentially increasing statistical power, reducing costs, or both, allowing more efficient use of resources. Added precision could also improve subgroup analyses and estimates of effect variability, broadening the generalizability of results.
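The core idea can be illustrated with a small simulation. The sketch below is an illustrative assumption about the general approach, not the project's actual estimator: a prediction model is trained only on remnant (non-RCT) data, its predictions are subtracted from the RCT outcomes, and the treatment effect is estimated as the difference in mean residuals. Because the predictions depend only on pre-treatment covariates and external data, the estimator stays unbiased, while good predictions shrink the residual variance and thus the standard error. All variable names and the data-generating process are hypothetical; a linear least-squares predictor stands in for the flexible machine learning algorithms the project envisions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical remnant: many students with covariates and outcomes, outside the RCT.
n_rem, n_rct = 5000, 200
X_rem = rng.normal(size=(n_rem, 3))
y_rem = X_rem @ np.array([2.0, -1.0, 0.5]) + rng.normal(size=n_rem)

# Hypothetical RCT drawn from the same population, with a true effect of 0.3.
X = rng.normal(size=(n_rct, 3))
z = rng.integers(0, 2, size=n_rct)  # random treatment assignment
y = X @ np.array([2.0, -1.0, 0.5]) + 0.3 * z + rng.normal(size=n_rct)

# Step 1: fit a predictor on the remnant only (any ML algorithm could be
# substituted here; least squares keeps the sketch dependency-free).
beta = np.linalg.lstsq(X_rem, y_rem, rcond=None)[0]

# Step 2: residualize the RCT outcomes against the remnant-trained predictions.
resid = y - X @ beta

# Step 3: difference in mean residuals estimates the same effect as the
# naive difference in means, but with smaller variance if predictions are good.
naive = y[z == 1].mean() - y[z == 0].mean()
adjusted = resid[z == 1].mean() - resid[z == 0].mean()

print(f"naive estimate:    {naive:.3f}")
print(f"adjusted estimate: {adjusted:.3f}")
print(f"residual var / outcome var: {resid.var() / y.var():.3f}")
```

The variance ratio printed at the end shows how much of the outcome variation the remnant-trained predictor removes; the precision gain is what effectively enlarges the RCT's sample size.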
The research team will adapt this framework to common RCT designs and data structures in education research, including blocked-cluster randomized trials and longitudinal measurements. The framework will also be extended to handle common methodological issues in education research, such as estimating subgroup effects, assessing generalizability, and analyzing data from RCTs that are "broken" due to attrition, test opt-out, or other post-randomization selection. The main product of this research will be flexible, user-friendly, open-source software that is available to and readily usable by applied education researchers.