
IES Grant

Title: Practical Tools for Large-Scale Evaluation of Text Data in Randomized Trials in Education
Center: NCER Year: 2022
Principal Investigator: Miratrix, Luke Awardee: Harvard University
Program: Statistical and Research Methodology in Education
Award Period: 3 years (07/01/2022 – 06/30/2025) Award Amount: $894,352
Type: Methodological Innovation Award Number: R305D220032

Co-Principal Investigators: Mozer, Reagan; Al-Adeimi, Shireen

The purpose of this grant is to provide machine learning tools that improve the statistical power of education impact analyses relying on human scoring of text passages written by students. Machine learning is particularly useful in studies that have more text data than the time, money, or coders needed to score it all by hand. The texts and scores produced by the available human raters are used to train a machine learning algorithm, which then scores the remaining passages. This approach both increases the statistical power of text-based studies and leverages machine learning to run additional analyses investigating potential differences in text patterns between the groups.
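The train-on-the-rated-subset, machine-score-the-rest workflow described above can be sketched in a few lines. The example below is a minimal illustration only, using a crude nearest-centroid, bag-of-words scorer in plain Python; the project's actual models are not specified in this abstract, and the function names here are hypothetical.

```python
from collections import Counter
import math

def features(text):
    # Crude bag-of-words representation (a stand-in for richer NLP features).
    return Counter(text.lower().split())

def centroid(feature_dicts):
    # Average word counts across all texts assigned the same human score.
    total = Counter()
    for f in feature_dicts:
        total.update(f)
    n = len(feature_dicts)
    return {w: c / n for w, c in total.items()}

def cosine(a, b):
    # Cosine similarity between two sparse word-count vectors.
    dot = sum(v * b.get(w, 0.0) for w, v in a.items())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def nearest_centroid_score(rated_texts, rated_scores, unrated_texts):
    """Assign each unrated text the human score whose class centroid it most resembles."""
    by_score = {}
    for text, score in zip(rated_texts, rated_scores):
        by_score.setdefault(score, []).append(features(text))
    centroids = {s: centroid(fs) for s, fs in by_score.items()}
    preds = []
    for text in unrated_texts:
        f = features(text)
        preds.append(max(centroids, key=lambda s: cosine(f, centroids[s])))
    return preds
```

In use, the human-rated subset supplies `rated_texts` and `rated_scores`, the machine-scored remainder is appended to the outcome vector, and the impact analysis proceeds on the combined scores.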

The research team will first develop new models for automated scoring that align with human judgment. The team will implement these models in statistical software and then test them on real text datasets. The researchers will then extend the models and software so that the machine learning algorithm can evaluate potential differences between experimental or quasi-experimental groups in the texts those groups produce, not just differences in average scores. They will continue testing the software on additional real datasets to gauge its performance at discerning differences in text patterns. When the software is ready, the research team will create a free, user-friendly, web-based version for applied researchers. The team will then create instructional materials for workshops and short courses that teach users how to use the software. The team will also produce manuscripts for publication in methodological and applied peer-reviewed journals.
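One common way to compare groups on text patterns rather than average scores is a permutation test on a text statistic. The sketch below illustrates that general idea in plain Python; it is not the team's method, and the `word_share` statistic is a hypothetical example of a pattern one might test.

```python
import random

def word_share(texts, word):
    # Fraction of texts in a group that contain the given word.
    return sum(word in t.lower().split() for t in texts) / len(texts)

def permutation_test(treat_texts, ctrl_texts, stat, n_perm=2000, seed=0):
    """Two-sided permutation p-value for a group difference in a text statistic.

    Repeatedly reshuffles group labels and counts how often the shuffled
    difference is at least as large as the observed one.
    """
    rng = random.Random(seed)
    observed = abs(stat(treat_texts) - stat(ctrl_texts))
    pooled = list(treat_texts) + list(ctrl_texts)
    n_t = len(treat_texts)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = abs(stat(pooled[:n_t]) - stat(pooled[n_t:]))
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the p-value strictly positive.
    return (hits + 1) / (n_perm + 1)
```

The same machinery works for any group-level statistic computed from the texts, such as average sentence length or the frequency of causal connectives, so differences in writing patterns can be tested directly rather than only through mean score differences.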