Project Activities
The project had four specific aims addressing multilevel analysis with complex sample data. First, the project used a Monte Carlo simulation to quantify how ignoring the sampling design in a multilevel model affects estimates of parameters and sampling variances. Bias in these estimates was examined across a range of sampling designs and population characteristics typical of education-related datasets, specifically for multilevel covariance structure modeling of complex sample data. The first step in developing the simulation study was an extensive review of education-related datasets to define the values used in the simulation conditions, as explained in the research plan.
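The idea behind Aim 1 can be illustrated with a much simpler estimator than a multilevel model. The following is a minimal sketch, not the project's simulation: it assumes a hypothetical two-stratum population in which the smaller, higher-scoring stratum is oversampled, and compares the bias of a mean that ignores the design with one that uses sampling weights. All names and numeric values are illustrative assumptions.

```python
import random

# All values below are hypothetical and chosen only for illustration.
SHARE_A, SHARE_B = 0.2, 0.8          # population shares of the two strata
MEAN_A, MEAN_B, SD = 60.0, 50.0, 10.0
N_A = N_B = 200                      # equal sample sizes, so stratum A is oversampled
TRUE_MEAN = SHARE_A * MEAN_A + SHARE_B * MEAN_B


def one_replication(rng):
    """Draw one disproportionate stratified sample; return (unweighted, weighted) means."""
    y_a = [rng.gauss(MEAN_A, SD) for _ in range(N_A)]
    y_b = [rng.gauss(MEAN_B, SD) for _ in range(N_B)]
    unweighted = (sum(y_a) + sum(y_b)) / (N_A + N_B)
    # Weights proportional to the inverse selection probability: share / sample size.
    w_a, w_b = SHARE_A / N_A, SHARE_B / N_B
    weighted = (w_a * sum(y_a) + w_b * sum(y_b)) / (w_a * N_A + w_b * N_B)
    return unweighted, weighted


def simulate(replications=500, seed=1):
    """Average each estimator over replications and subtract the true mean."""
    rng = random.Random(seed)
    results = [one_replication(rng) for _ in range(replications)]
    bias_unweighted = sum(u for u, _ in results) / replications - TRUE_MEAN
    bias_weighted = sum(w for _, w in results) / replications - TRUE_MEAN
    return bias_unweighted, bias_weighted


bias_unweighted, bias_weighted = simulate()
print(f"bias ignoring the design: {bias_unweighted:.2f}")
print(f"bias using weights:       {bias_weighted:.2f}")
```

The same replicate-and-compare structure extends to model parameters and their sampling variances, with the estimator swapped for the model of interest.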
Second, the project determined the best method of approximating level-1 and level-2 sampling weights from the overall (unconditional) sampling weights available on public-release datasets. This was accomplished by comparing the approximated values with the known values from simulated data. Using the simulated data introduced in Aim 1, unconditional weights for the ultimate sampling unit (USU) were used to approximate conditional weights for the USU and for the secondary sampling unit (SSU, in a 3-stage design) or primary sampling unit (PSU, in a 2-stage design). Bias in these approximations was assessed by correlating the known weights with the approximations.
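As a rough illustration of the comparison described for Aim 2, the sketch below splits overall weights into level-2 and conditional level-1 weights for a 2-stage design and correlates the known level-1 weights with the approximations. It assumes one possible approximation, not necessarily any of those the project evaluated: that the cluster population size is available and that conditional weights within a cluster sum to it. The function names and data are hypothetical.

```python
def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def split_overall_weights(overall, cluster_size):
    """Approximate (level-2 weight, conditional level-1 weights) for one cluster.

    Assumption: the conditional level-1 weights sum to the cluster
    population size, so sum(overall) is about w_j * cluster_size.
    """
    w2 = sum(overall) / cluster_size
    w1 = [w / w2 for w in overall]
    return w2, w1


# Hypothetical clusters: (reported size N_j, known w_j, known w_i|j values).
# In the last cluster the known level-1 weights sum to 80, not the reported 90,
# so the approximation is deliberately imperfect there.
clusters = [
    (100, 12.0, [5.0] * 20),
    (60, 20.0, [2.0] * 10 + [4.0] * 10),
    (200, 6.0, [10.0] * 20),
    (90, 15.0, [3.0] * 10 + [5.0] * 10),
]

known_w1, approx_w1, approx_w2 = [], [], []
for size, w_j, w_cond in clusters:
    overall = [w_j * w for w in w_cond]   # what a public-release file would provide
    w2_hat, w1_hat = split_overall_weights(overall, size)
    approx_w2.append(w2_hat)
    known_w1.extend(w_cond)
    approx_w1.extend(w1_hat)

r_level1 = pearson(known_w1, approx_w1)
print(f"correlation of known and approximated level-1 weights: {r_level1:.4f}")
```

When the assumption holds (the first three clusters), the known weights are recovered exactly; the fourth cluster shows how a mismatch between the reported cluster size and the weight sum distorts both levels.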
Third, the project determined the most robust method of sampling variance estimation by comparing the performance of a sandwich estimator with replication methods. Bias found with each technique was expected to vary with the data and sampling conditions. Typical conditions in education-related datasets were examined using Monte Carlo simulations. The simulated data (introduced in Aim 1) were analyzed with the MPML method and with three approaches to sampling variance estimation (linearization, jackknife replication, and bootstrap replication) to determine the method that yields the least bias in sampling variances and provides adequate 95 percent confidence interval coverage rates for the parameters of interest.
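To make the contrast between linearization and replication concrete, here is a minimal sketch for a weighted mean under single-stage cluster sampling. It is a simplified stand-in, not the MPML analysis used in the project: it assumes a single stratum, treats clusters as sampled with replacement, and uses a delete-one-cluster jackknife. The data and function names are hypothetical.

```python
import random


def weighted_mean(clusters):
    """Weighted mean over all (y, w) pairs in all clusters."""
    num = sum(w * y for c in clusters for (y, w) in c)
    den = sum(w for c in clusters for (_, w) in c)
    return num / den


def linearized_variance(clusters):
    """Taylor-series (sandwich-type) variance of the weighted mean, clusters as PSUs."""
    n = len(clusters)
    est = weighted_mean(clusters)
    total_w = sum(w for c in clusters for (_, w) in c)
    # Cluster totals of the linearized variable w * (y - est) / total_w.
    z = [sum(w * (y - est) for (y, w) in c) / total_w for c in clusters]
    return n / (n - 1) * sum(zj ** 2 for zj in z)


def jackknife_variance(clusters):
    """Delete-one-cluster jackknife variance of the weighted mean."""
    n = len(clusters)
    est = weighted_mean(clusters)
    reps = [weighted_mean(clusters[:j] + clusters[j + 1:]) for j in range(n)]
    return (n - 1) / n * sum((r - est) ** 2 for r in reps)


# Hypothetical data: 30 clusters of 15 units with a shared cluster effect
# and one sampling weight per cluster.
rng = random.Random(7)
clusters = []
for _ in range(30):
    effect = rng.gauss(0, 5)
    weight = rng.uniform(1, 3)
    clusters.append([(50 + effect + rng.gauss(0, 10), weight) for _ in range(15)])

v_lin = linearized_variance(clusters)
v_jk = jackknife_variance(clusters)
print(f"linearized: {v_lin:.4f}  jackknife: {v_jk:.4f}")
```

For a smooth statistic such as a mean, the two estimates are typically close with this many clusters; differences between methods are expected to matter more for model parameters and for designs with few clusters.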
Fourth, the project examined the performance of the scaled chi-square difference test statistic in model selection, both when the sampling design was taken into account and when it was ignored. The fit of the models estimated for Aims 1 and 3 was compared with the fit of six misspecified models: three over-specified and three under-specified.
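For reference, a scaled chi-square difference statistic of the Satorra-Bentler type can be computed from the scaled chi-square values, scaling correction factors, and degrees of freedom of two nested models. The sketch below assumes that form; the project description does not state which variant was used, and the input values are hypothetical.

```python
def scaled_chi_square_difference(t0, c0, d0, t1, c1, d1):
    """Scaled chi-square difference for a nested model (0) versus a comparison model (1).

    t0, t1: scaled chi-square values
    c0, c1: scaling correction factors
    d0, d1: degrees of freedom, with d0 > d1
    Returns (scaled difference statistic, difference in degrees of freedom).
    """
    if d0 <= d1:
        raise ValueError("the nested model must have more degrees of freedom")
    # Scaling correction for the difference test.
    cd = (d0 * c0 - d1 * c1) / (d0 - d1)
    if cd <= 0:
        # A non-positive correction can occur in practice and makes the statistic unusable.
        raise ValueError("non-positive difference scaling correction")
    # t * c recovers each model's unscaled chi-square before differencing.
    trd = (t0 * c0 - t1 * c1) / cd
    return trd, d0 - d1


# Hypothetical fit results for two nested models.
trd, df_diff = scaled_chi_square_difference(t0=120.0, c0=1.2, d0=30,
                                            t1=95.0, c1=1.25, d1=26)
print(f"scaled difference = {trd:.3f} on {df_diff} df")
```

The resulting statistic is referred to a chi-square distribution with the returned degrees of freedom.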
Products and publications
Journal article, monograph, or newsletter
Stapleton, L. M. (2012). Evaluation of conditional weight approximations for two-level models. Communications in Statistics: Simulation and Computation, 41, 182-204.
Stapleton, L. M., & Kang, Y. (2018). Design effects of multilevel estimates from national probability samples. Sociological Methods & Research, 47(3), 430-457.