Non-Linear Multilevel Latent Variable Modeling with a Metropolis-Hastings Robbins-Monro Algorithm
Co-Principal Investigator: Michael Seltzer (UCLA)
The goal of this project is to bring together the benefits of multilevel modeling and latent variable modeling. To do so, the project proposes a flexible nonlinear multilevel latent variable modeling framework under which: (1) random effects and latent variables are treated synonymously because both represent unobserved heterogeneity; (2) a nonlinear random effect regression model permits the specification and testing of important structural relations (e.g., mediation or moderation effects) among latent variables; and (3) both the outcome variable and the predictors (at any level) can be latent variables measured with fallible indicators.
This nonlinear model is flexible in two respects: (1) the measurement models for latent variables are derived directly from multidimensional item response theory (IRT), allowing multiple types of observed variables (continuous, ordinal, nominal, count, etc.); and (2) a general nonlinear functional form is allowed at each level, including product interactions, polynomial effects, and nonlinear regression functions involving latent variables. The framework seeks to provide a systematic solution to measurement and modeling issues (e.g., attenuation problems connected with measurement error in predictors) routinely encountered in cluster-based experimental and quasi-experimental studies, and in studies of schooling based on large-scale longitudinal and cross-sectional data sets (e.g., studies in which cross-level interactions and contextual effects are of particular interest).
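The attenuation problem mentioned above can be illustrated with a small simulation. The Python sketch below is not part of the project's software; the function names, the sample size, the true slope of 2.0, and the unit error variance are all assumptions chosen for illustration. It regresses an outcome first on the true latent predictor and then on a fallible indicator of it.

```python
import random


def ols_slope(x, y):
    """Ordinary least squares slope of y on x (single predictor)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def simulate_attenuation(n=20000, beta=2.0, error_sd=1.0, seed=1):
    """Return (slope using the latent predictor, slope using its indicator).

    Hypothetical setup: latent ~ N(0, 1), outcome = beta * latent + noise,
    indicator = latent + measurement error with standard deviation error_sd.
    """
    rng = random.Random(seed)
    latent = [rng.gauss(0.0, 1.0) for _ in range(n)]
    outcome = [beta * x + rng.gauss(0.0, 1.0) for x in latent]
    indicator = [x + rng.gauss(0.0, error_sd) for x in latent]
    return ols_slope(latent, outcome), ols_slope(indicator, outcome)
```

Under these assumptions the indicator has reliability 1 / (1 + 1) = 0.5, so the slope estimated from the indicator should be close to half the true slope, while the slope estimated from the latent predictor should be close to the true value.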
The project also seeks to improve the speed of statistical computation in multilevel modeling through the use of a computationally efficient Metropolis-Hastings Robbins-Monro (MH-RM) algorithm to tackle the high-dimensional integration problem inherent in likelihood-based estimation and inference for such a general model. The MH-RM algorithm combines elements of Markov chain Monte Carlo (MCMC), widely used in Bayesian statistics, with stochastic approximation, an optimization method used in engineering.
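The way MCMC and stochastic approximation fit together can be sketched on a deliberately tiny model. The Python code below is an illustrative toy, not the project's algorithm or its C++ implementation. It assumes a hypothetical model in which each observation y_i is N(z_i, 1) and each latent z_i is N(mu, 1), so the maximum likelihood estimate of mu is simply the sample mean of y, which makes the result easy to check. The function names, the burn-in length, the proposal step size, and the gain sequence are all assumptions.

```python
import math
import random


def log_posterior(z, y, mu):
    """log p(y | z) + log p(z | mu), up to an additive constant."""
    return -0.5 * (y - z) ** 2 - 0.5 * (z - mu) ** 2


def mh_step(z, y, mu, rng, step=1.0):
    """One random-walk Metropolis-Hastings update of a single latent value."""
    proposal = z + rng.gauss(0.0, step)
    log_ratio = log_posterior(proposal, y, mu) - log_posterior(z, y, mu)
    if log_ratio >= 0.0 or rng.random() < math.exp(log_ratio):
        return proposal
    return z


def mhrm_estimate_mean(y, burn_in=100, iterations=1000, seed=0):
    """Toy MH-RM-style estimate of mu in the hypothetical model above."""
    rng = random.Random(seed)
    n = len(y)
    mu = 0.0
    z = [0.0] * n
    info = float(n)  # complete-data information for mu
    for k in range(1, burn_in + iterations + 1):
        # Imputation: draw latent values with one MH step each.
        z = [mh_step(zi, yi, mu, rng) for zi, yi in zip(z, y)]
        # Complete-data score for mu, evaluated at the imputed values.
        score = sum(zi - mu for zi in z)
        # Gain: constant during burn-in, then decreasing as 1 / k.
        gain = 1.0 if k <= burn_in else 1.0 / (k - burn_in)
        # Recursive information update (constant in this toy model).
        info = info + gain * (n - info)
        # Robbins-Monro update of the parameter.
        mu = mu + gain * score / info
    return mu
```

In this sketch the MH step replaces the intractable integral over the latent variables with a single imputation per iteration, and the decreasing gains average out the resulting Monte Carlo noise. Calling `mhrm_estimate_mean(y)` on data simulated from the assumed model should return a value close to the sample mean of `y`.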
To reach its goal, the project will carry out the following steps. First, it will derive theoretical properties of the proposed model with a focus on identification and substantive interpretability of the parameters. The modeling framework will be implemented in the C++ programming language. Second, it will extend the MH-RM algorithm and optimize it for use with the nonlinear multilevel latent variable model. The algorithm will also be implemented in C++. Third, it will conduct simulation studies to test the performance of the algorithm and define the conditions under which the model can be applied. Fourth, it will develop new model-checking diagnostic procedures targeted at model-data fit. Fifth, the developed software and methods will be used to analyze large-scale educational data sets (e.g., ECLS-K, LSAY, and PISA) to illustrate them empirically and to contrast the results with those from analyses using observed predictors. In addition, the efficiency of the MH-RM-based program will be compared with that of other available programs (e.g., WinBUGS, Mplus, or GLLAMM).