Title: Practical Tools for Large-Scale Evaluation of Text Data in Randomized Trials in Education
Principal Investigator: Miratrix, Luke
Awardee: Harvard University
Program: Statistical and Research Methodology in Education
Award Period: 3 years (07/01/2022 – 06/30/2025)
Award Amount: $894,352
Type: Methodological Innovation
Award Number: R305D220032
Co-Principal Investigators: Mozer, Reagan; Al-Adeimi, Shireen
The purpose of this grant is to provide machine learning tools that improve the statistical power of education impact analyses that rely on human scoring of text passages written by students. Machine learning is particularly useful when a study has more text data than the time, money, or coders needed to score it all by hand. The text and scores produced by the available human raters are used to train a machine learning algorithm, which then scores the remaining text passages. This approach both increases the statistical power of text-based studies and leverages key aspects of machine learning to run additional analyses investigating potential differences in text patterns between groups.
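The core idea, train a scoring model on the human-rated subset and let it score the rest, can be sketched in a few lines. This is a minimal, hypothetical illustration, not the project's actual method: the texts, scores, and the simple word-overlap nearest-neighbor rule are all stand-ins for a real trained model.

```python
# Hypothetical sketch (illustrative names and data): machine-score the
# unrated texts by copying the score of the most similar human-rated
# text, using word overlap as a stand-in for a real trained model.

def similarity(a: str, b: str) -> float:
    """Jaccard overlap between the word sets of two texts."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa | wb) if wa | wb else 0.0

def machine_score(text: str, scored: list[tuple[str, float]]) -> float:
    """Predict a score from the most similar human-scored text (1-NN)."""
    return max(scored, key=lambda pair: similarity(text, pair[0]))[1]

# Human raters score a subset; the algorithm scores the remainder.
human_scored = [
    ("clear argument supported with strong evidence", 4.0),
    ("short vague answer with no evidence", 1.0),
]
unscored = ["a clear argument with evidence", "a vague answer"]
predictions = [machine_score(t, human_scored) for t in unscored]
print(predictions)  # → [4.0, 1.0]
```

In practice the trained model would be a regression or classification model over rich text features; the power gain comes from analyzing the full sample of machine-scored texts rather than only the human-scored subset.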
The research team will first develop new models for automated scoring that align with human judgment. The team will implement these models in statistical software and then test them on real text datasets. The researchers will then extend the models and software so that the machine learning algorithm can evaluate potential differences between experimental or quasi-experimental groups in the texts the groups produce, not just differences in the groups' average scores. They will continue testing the software on additional real datasets to gauge how well it discerns differences in text patterns. When the software is ready, the research team will create a free, user-friendly, web-based version for applied researchers. The team will then create instructional materials for workshops and short courses that teach users how to use the software. The team will also produce manuscripts for publication in methodological and applied peer-reviewed journals.
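One common way to test for text-pattern differences between groups, beyond average score differences, is a permutation test on a text feature. The sketch below is a hypothetical illustration with made-up data and a deliberately simple feature (distinct-word count); the project's actual methods may differ.

```python
# Hypothetical sketch (illustrative data): a two-sided permutation test
# asking whether treatment and control texts differ on a text-pattern
# feature -- here, the number of distinct words per text.
import random

def feature(text: str) -> float:
    """A simple text-pattern feature: number of distinct words."""
    return float(len(set(text.lower().split())))

def permutation_pvalue(treat, control, n_perm=2000, seed=0):
    """P-value for the absolute difference in mean feature values."""
    rng = random.Random(seed)
    pooled = [feature(t) for t in treat + control]
    k = len(treat)
    observed = abs(sum(pooled[:k]) / k - sum(pooled[k:]) / len(control))
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # break the group labels
        diff = abs(sum(pooled[:k]) / k - sum(pooled[k:]) / len(control))
        if diff >= observed:
            hits += 1
    return hits / n_perm

treat = ["many varied distinct words here", "rich diverse vocabulary usage"]
control = ["same same same same", "word word word"]
p = permutation_pvalue(treat, control)
```

A small p-value suggests the groups' texts differ systematically on the chosen feature; richer versions of this idea use many features or a classifier's accuracy as the test statistic.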