IES Grant

Title: Methods for Addressing Measurement Error Issues in Longitudinal Educational Studies
Center: NCER
Year: 2016
Principal Investigator: Wang, Chun
Awardee: University of Washington
Program: Statistical and Research Methodology in Education–Early Career
Award Period: 2 years (9/1/2016 – 8/31/2018)
Award Amount: $195,382
Type: Methodological Innovation
Award Number: R305D170042

Previous Grant Number: R305D160010
Previous Awardee: University of Minnesota

Co-Principal Investigator: Xu, Gongjun

Purpose: The purpose of this project was to investigate different methods of using a two-stage framework to address measurement error in the estimation of latent theta scores obtained from standardized tests through item response theory (IRT). In the two-stage approach, an appropriate measurement model is first fitted to the data, and the resulting theta scores are then used in subsequent analyses. Potential benefits of this approach include clearer definition of factors, convenience for secondary data analysis, convenience for model calibration and fit evaluation, and avoidance of improper solutions. Measurement error is accounted for in the second-stage statistical analysis in one of two ways: by combining the measurement error model and the structural model in the same analysis, or via a corrective approach, in which the estimated thetas are first used in a regression analysis and the bias in the estimated regression coefficients and their standard errors is then corrected post hoc.
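The corrective idea can be illustrated with a minimal Python sketch. This is not the project's method or software (which was written in R). It is a generic textbook disattenuation correction under stated assumptions: a single predictor, classical measurement error (theta_hat = theta + e, with e independent of theta), and a known standard error for each estimated theta. The function names, the simulated data, and all parameter values are hypothetical.

```python
import random
import statistics


def ols_slope(x, y):
    """Slope from a simple least-squares regression of y on x."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def corrected_slope(theta_hat, y, se):
    """Return (naive slope, slope corrected for attenuation).

    Assumes classical measurement error, so the naive slope is biased
    toward zero by the reliability of theta_hat.
    """
    naive = ols_slope(theta_hat, y)
    observed_var = statistics.variance(theta_hat)
    error_var = statistics.fmean([s ** 2 for s in se])
    # Share of the observed variance that is true-score variance.
    reliability = (observed_var - error_var) / observed_var
    return naive, naive / reliability


def simulate(n=20000, beta=0.5, se=0.6, seed=1):
    """Hypothetical data: stage one is replaced by adding known noise."""
    rng = random.Random(seed)
    theta = [rng.gauss(0, 1) for _ in range(n)]
    theta_hat = [t + rng.gauss(0, se) for t in theta]
    y = [beta * t + rng.gauss(0, 1) for t in theta]
    return theta_hat, y, [se] * n


theta_hat, y, se = simulate()
naive, corrected = corrected_slope(theta_hat, y, se)
print(f"naive slope:     {naive:.3f}")
print(f"corrected slope: {corrected:.3f}  (true value used in simulation: 0.5)")
```

In this toy setting the naive slope is pulled toward zero, and dividing by the estimated reliability moves it back toward the simulated value. Correcting the standard errors as well, which the abstract also mentions, is outside the scope of this sketch.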

Project Activities: The researchers first completed the mathematical theory needed for three of the five proposed methods of accounting for measurement error in test scores. After implementing the methods in R, the researchers conducted simulation studies to evaluate the performance of the proposed approaches under a number of conditions similar to those found in national and state education-related datasets. The researchers used science and math test data from the National Education Longitudinal Study of 1988 (NELS:88) to illustrate the five methods of accounting for measurement error. By the end of the project, the researchers expect to release a user-friendly version of the software for running each of the methods and to disseminate the results at conferences and in peer-reviewed journals.

Products and Publications


Journal article, monograph, or newsletter

Chen, Y., Li, X., Liu, J., Xu, G., & Ying, Z. (2017). Exploratory Item Classification Via Spectral Graph Clustering. Applied Psychological Measurement, 41(8), 579–599.

Lu, J., & Wang, C. (2020). A response time process model for not-reached and omitted items. Journal of Educational Measurement, 57(4), 584–620.

Wang, C., & Lu, J. (2021). Learning attribute hierarchies from data: Two exploratory approaches. Journal of Educational and Behavioral Statistics, 46(1), 58–84.

Wang, C., & Nydick, S. W. (2020). On longitudinal item response theory models: A didactic. Journal of Educational and Behavioral Statistics, 45(3), 339–368.

Wang, C., & Weiss, D. J. (2018). Multivariate Hypothesis Testing Methods for Evaluating Significant Individual Change. Applied Psychological Measurement, 42(3), 221–239.

Wang, C., & Zhang, X. (2019). A Note on the Conversion of Item Parameters Standard Errors. Multivariate Behavioral Research, 54, 307–321.

Wang, C., Chen, P., & Huebner, A. (2021). Stopping rules for multi-category computerized classification testing. British Journal of Mathematical and Statistical Psychology, 74(2), 184–202.

Wang, C., Chen, P., & Jiang, S. (2020). Item Calibration Methods with Multiple Sub-Scale Multistage Testing. Journal of Educational Measurement, 57(1), 3–28.

Wang, C., Xu, G., & Zhang, X. (2019). Correction for Item Response Theory Latent Trait Measurement Error in Linear Mixed Effects Models. Psychometrika, 84, 673–700.

Wang, C., Xu, G., Shang, Z., & Kuncel, N. (2018). Detecting Aberrant Behavior and Item Preknowledge: A Comparison of Mixture Modeling Method and Residual Method. Journal of Educational and Behavioral Statistics.

Xu, G., Chiou, S. H., Huang, C. Y., Wang, M. C., & Yan, J. (2017). Joint Scale-Change Models for Recurrent Events and Failure Time. Journal of the American Statistical Association, 112(518), 794–805.

Xu, G., & Shang, Z. (2017). Identifying Latent Structures in Restricted Latent Class Models. Journal of the American Statistical Association.

Zhang, X., Tao, J., Wang, C., & Shi, N-Z. (2019). Bayesian Model Selection Methods for Multilevel IRT Models: A Comparison of Five DIC-Based Indices. Journal of Educational Measurement, 56(1), 3–27.

Zhang, X., & Wang, C. (2021). Measurement bias and error correction in a two-stage estimation for multilevel IRT models. British Journal of Mathematical and Statistical Psychology, 74, 247–274.

Zhang, X., Wang, C., & Tao, J. (2018). Assessing Item-Level Fit for Higher Order Item Response Theory Models. Applied Psychological Measurement.

Zhang, X., Wang, C., Weiss, D. J., & Tao, J. (2021). Bayesian inference for IRT models with non-normal latent trait distributions. Multivariate Behavioral Research, 56(5), 703–723.

Working paper

Gu, Y., & Xu, G. (2018). Partial Identifiability of Restricted Latent Class Models. arXiv preprint arXiv:1803.04353.