Project Activities
The project team will extend the Bayesian Additive Regression Trees (BART) modeling framework to include new features that are more relevant to analysis of complex research designs in education. The extensions to BART will be implemented in the thinkCausal web app, and disseminated through research papers, classroom instruction, and workshops at education research conferences.
Structured Abstract
Research design and methods
The research team will capitalize on the strengths of the existing BART framework for causal inference and extend it to include the following features: multivariate models for multiple outcomes with explicit covariance structures, targeted priors for multiple treatments and subgroups, and principled strategies for detection of effect moderation. The BART framework will naturally allow for estimation of effects using flexible nonparametric response surfaces, and current overlap diagnostics will be extended to handle the additional estimands induced by this framework. This software will be embedded in the existing thinkCausal framework, a causal inference tool that scaffolds analysis and provides opportunities for users to learn about the underlying concepts. A novel pre-registration wizard will help researchers develop principled but flexible analysis strategies, keeping them honest while allowing maximal use of existing data.
User Testing: The project team will use strategies including: 1) gathering feedback from users on ease of use and 2) a randomized trial to assess whether researcher conclusions using the new tool are better than those reached with standard software.
Use in Applied Education Research: It is common for education studies to collect data on more than one treatment or more than one outcome, or to examine moderation across several different subgroups. Studies that report estimates for a set of research questions are likely presenting estimates and associated uncertainty quantifications (standard errors, p-values, or confidence intervals) that are overconfident and overstate what could be replicated (in terms of statistical significance, magnitude, or direction) in future studies. The product of this grant would be immediately useful in a wide variety of education research settings for creating more replicable estimates of these quantities.
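As a rough illustration of why reporting several estimates at once tends to overstate confidence, the sketch below computes the chance of at least one false positive across a set of tests. It assumes the tests are independent and that no true effects exist, which real studies rarely satisfy exactly. The function name `familywise_error_rate` is hypothetical and is not part of thinkCausal or the project's methods.

```python
def familywise_error_rate(num_tests: int, alpha: float = 0.05) -> float:
    """Probability of at least one false positive across independent tests.

    Assumes every null hypothesis is true and the tests are independent,
    so each test avoids a false positive with probability (1 - alpha).
    """
    if num_tests < 1:
        raise ValueError("num_tests must be at least 1")
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must be strictly between 0 and 1")
    return 1.0 - (1.0 - alpha) ** num_tests


if __name__ == "__main__":
    # Example: a study reporting effects for several outcomes or subgroups.
    for m in (1, 3, 5, 10, 20):
        print(f"{m:>2} tests: {familywise_error_rate(m):.3f}")
```

Under these simplifying assumptions, the chance of a spurious finding grows quickly with the number of treatments, outcomes, or subgroups examined, which is the replicability threat the paragraph above describes.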
The project team will also address four specific aspects of applied research:
- Supporting the Standards for Excellence in Education Research (SEER):
  - The project will create a "wizard" to help researchers with pre-registration.
  - The modeling tools developed will "robustify" the estimation of heterogeneous treatment effects, which is crucial to the generalizability of findings.
  - All methods will be open source, and the analysis tool will automatically document analytic choices as they are made, simplifying and ensuring reproducibility.
  - Tools will be made available in an accessible, easy-to-use, scaffolded software tool that does not require technical expertise and helps the user understand the underlying concepts.
- Quasi-Experimental Designs: The project framework is based on Bayesian Additive Regression Trees, a machine learning algorithm demonstrated to be a strong, often superior competitor to quasi-experimental methods (for example, propensity score matching).
- Variability in Effects: The project's modeling strategy has been shown to have strong performance in estimating variable treatment effects.
- Data Science Tools for Education Researchers: The expanded thinkCausal software package is built with R Shiny to allow for easy access to the sophisticated algorithms for which it acts as an overlay.
People and institutions involved
IES program contact(s)
Project contributors
Products and publications
Products: The project team will create an easy-to-use software tool that mitigates many of the threats to replicability induced by multiple testing and researcher degrees of freedom in observational causal inference settings with multiple treatments, outcomes, or moderators. This approach will simultaneously estimate multiple effects in randomized or observational studies with these complex elements in a way that not only diagnoses and reduces these threats, but also appropriately accounts for uncertainty. This software will be embedded in a user-friendly tool that also includes a "wizard" to create pre-registration plans that make efficient use of the available data while controlling for researcher degrees of freedom. The project team will disseminate results using a variety of strategies, including methodological research papers, applied education research papers, presentations at both methods and education conferences, software available in R, user-friendly software available through a web server, workshops at education conferences, and videos describing how to use the software.
Related projects
Supplemental information
Co-Principal Investigators: Hill, Jennifer; Perrett, George