IES Grant

Title: Model-based Multiple Imputation for Multilevel Data: Methodological Extensions and Software Enhancements
Center: NCER Year: 2019
Principal Investigator: Enders, Craig Awardee: University of California, Los Angeles
Program: Statistical and Research Methodology in Education
Award Period: 3 years (07/01/19 – 06/30/22) Award Amount: $868,046
Type: Methodological Innovation Award Number: R305D190002

Co-Principal Investigators: Du, Han; Keller, Brian

Purpose: Missing data are exceedingly common in educational research. Education research studies have missing data because students opt out of achievement testing, skip test items, or move to a different school district, among many other reasons. A previous IES award funded the development of a data analysis application, called Blimp, that addresses this issue by filling in missing values using sophisticated predictive models. The purpose of the current work was twofold: expand Blimp's missing data imputation features to handle a broader range of reasons for missing data and further develop the software into a general-use data analysis program.

Project Activities: The project team expanded the capabilities of the Blimp software application by adding a set of methods for data that are not missing at random. In addition, the team added residual-based model diagnostics, estimation for count outcomes, dispersion modeling features, and sampling weights.

Key Outcomes

Structured Abstract

Statistical/Methodological Product: Blimp is a general-use data analysis application available for macOS, Windows, and Linux. The software was created for education researchers with incomplete data sets. Blimp's suite of data analytic capabilities additionally makes the software broadly applicable to researchers in the social, behavioral, and medical sciences, among others. The software is available from the project's website, and a detailed user guide is available through the graphical interface's Help pull-down menu. The project also produced several peer-reviewed publications that provide technical details about the software's algorithms.

Development/Refinement Process: Developing reliable and accurate data analysis tools requires intensive testing and development and a high level of quality control. The development process involved the following steps:

  • Members of the research team developed a technical appendix that describes the procedure to be implemented, its algorithmic details, and relevant extant literature.
  • Initial coding and testing were conducted in R.
  • The programmer refined the initial code and implemented the new procedure into Blimp's C++ codebase. At this point, new functionality was available in a special version of the computational engine that team members accessed for testing.
  • The research team conducted extensive tests using computer simulation studies and, when other software packages were available, benchmarking against them. The testing also involved applying the new methodology to numerous real data sets.
  • New functionality was made available in a special beta version of the computational engine that members of the public could access by listing a shebang at the top of the script.
  • Additional testing on real-world data sets was done.
  • The beta features were moved to the software's public build.
  • Updates continue to be distributed in real time over the internet when the user launches the software.

User Testing: The research team relied on three sources of user testing. First, team members used Blimp in their graduate-level statistics courses. Second, researchers worldwide. Observing the target audience interacting with the software in these settings has been an important source of user testing and feedback. Third, the team maintains a user support email to assist practicing researchers. Addressing user inquiries has allowed the team to observe the software's behavior across a vast collection of real-world data structures. Collectively, these user testing experiences have allowed the research team to continually refine the graphical interface and printed output to enhance and simplify the user experience.

Related IES Projects: Multiple Imputation Procedures for Multilevel Data (R305D15056), Model-based Multiple Imputation for Multilevel Data: Methodological Extensions and Software Enhancements (R305D190002), Dealing with Missing Data in Educational Research: Methodological Innovations and Contemporary Recommendations (R305D22001)

Publications and Products

ERIC Citations: Find available citations in ERIC for this award.

Project Website:

Additional Online Resources and Information:

Select Publications:

Alacam, E., Enders, C. K., Du, H., & Keller, B. T. (2023). A factored regression model for composite scores with item-level missing data. Psychological Methods. Advance online publication.

Du, H., Keller, B. T., Alacam, E., & Enders, C. K. (2023). Comparing DIC and WAIC for multilevel models with missing data. Behavior Research Methods, 1–20.

Du, H., Enders, C. K., Keller, B. T., Bradbury, T., & Karney, B. (2022). A Bayesian latent variable selection model for nonignorable missingness. Multivariate Behavioral Research, 57, 478–512.

Keller, B. T. (2022). An introduction to factored regression models with Blimp. Psych, 4(1), 10–37.

Keller, B. T., & Enders, C. K. (2023). An investigation of factored regression missing data methods for multilevel models with cross-level interactions. Multivariate Behavioral Research, 58, 938–963.