
Grant Closed

Approaches for Weighting and Estimation of Public-release Education Data using Two-level Covariance Structure Models

NCER
Program: Statistical and Research Methodology in Education
Program topic(s): Core
Award amount: $159,620
Principal investigator: Laura Stapleton
Awardee: University of Maryland, College Park
Year: 2011
Project type: Methodological Innovation
Award number: R305D110050

Purpose

This project identified best methods for estimating parameters and their sampling variances when using multilevel analyses with data collected via complex sampling designs typically used in education research.

Project Activities

The project had four specific aims addressing multilevel analysis with complex sample data. First, the project used a Monte Carlo simulation study to quantify the bias in parameter and sampling variance estimates when multilevel covariance structure models are fit to complex sample data while ignoring the sample design. Bias was examined across a range of sampling designs and population characteristics typical of education-related datasets; as a first step in developing the simulation study, the values used within the simulation conditions were defined through an extensive review of education-related datasets, as explained in the research plan.

People and institutions involved

IES program contact(s)

Allen Ruby

Associate Commissioner for Policy and Systems
NCER

Products and publications

Journal article, monograph, or newsletter

Stapleton, L. M. (2012). Evaluation of conditional weight approximations for two-level models. Communications in Statistics: Simulation and Computation, 41, 182-204.

Stapleton, L. M., & Kang, Y. (2018). Design effects of multilevel estimates from national probability samples. Sociological Methods & Research, 47(3), 430-457.

Additional project information

Previous award details:

Previous award number:
R305D110046
Previous awardee:
University of Maryland, Baltimore County

Supplemental information

Traditional estimation of multilevel models assumes that school data are a function of random selection and that student data are obtained via random selection within schools. These assumptions are violated with typical national survey sampling designs, and parameter estimates and their sampling variances may be biased under traditional estimation. For example, most national education-related datasets use sampling procedures that are much more complicated in design. With a three-stage sample, primary sampling units (PSUs) of geographic areas are first selected, then schools within those PSUs are selected as secondary sampling units (SSUs), and finally teachers or students within those SSUs are selected as the ultimate sampling units (USUs). With a two-stage sample, the schools are typically selected directly as PSUs. Additionally, at each stage of selection, stratification of the population elements is used in selecting the sample. This stratum information may or may not be included in a researcher's statistical model.
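To make concrete how an overall (unconditional) weight arises from multi-stage selection, the following is a minimal sketch of an equal-probability two-stage design: schools are drawn as PSUs, students are drawn within each selected school, and each student's overall weight is the inverse of the product of the stagewise selection probabilities. All counts and names here are illustrative assumptions, not taken from any actual dataset.

```python
import random

# Toy population: 100 schools (PSUs) of 30 students each.
random.seed(0)
population = {f"school_{i}": [f"student_{i}_{j}" for j in range(30)]
              for i in range(100)}

n_schools, n_students = 10, 5
sampled_schools = random.sample(sorted(population), n_schools)

sample = []
for school in sampled_schools:
    for student in random.sample(population[school], n_students):
        p_school = n_schools / len(population)            # stage-1 selection probability
        p_student = n_students / len(population[school])  # stage-2 (conditional) probability
        overall_weight = 1.0 / (p_school * p_student)     # overall (unconditional) weight
        sample.append((school, student, overall_weight))

# With equal-probability selection at both stages, every overall weight
# is 1 / (0.1 * 1/6) = 60, and the weights sum to the population size
# of 3,000 students.
```

Public-release files typically provide only this overall weight, which is why the conditional level-1 and level-2 weights required by multilevel estimation must be approximated, as described below.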

Appropriate methods to model data from multi-stage stratified sampling designs have been proposed (e.g., multilevel pseudo-maximum likelihood [MPML]), but have not been tested under conditions similar to those found with national education-related datasets. These methods require sampling weights at both student and school levels and these level-1 and level-2 weights often are not found on public-release datasets.

Second, the project determined the best method of approximating level-1 and level-2 sampling weights from the overall (unconditional) sampling weights available on public-release datasets. This was accomplished by comparing the approximated values with the known values from simulated data: from the simulated data introduced in Aim 1, unconditional USU sampling weights were used to approximate conditional weights for the USU and for the SSU (in a three-stage design) or the PSU (in a two-stage design). Bias in these approximations was assessed by correlating the known weights with the approximations.
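One commonly discussed approximation of this kind, sketched below under assumed names and data layout, takes the level-2 (school) weight to be the mean of the overall student weights within each school and the conditional level-1 weight to be the overall weight divided by that school weight. This "cluster mean" method is one illustrative candidate, not necessarily the method the project ultimately recommended.

```python
from collections import defaultdict

def approximate_weights(records):
    """Approximate conditional weights from overall student weights.
    records: list of (school_id, overall_weight), one entry per student.
    Returns (level2, level1), where level2 maps school_id to the
    approximated school weight and level1 is a list of approximated
    within-school (conditional) student weights in input order."""
    by_school = defaultdict(list)
    for school, w in records:
        by_school[school].append(w)
    # Level-2 weight: mean overall weight of the students in the school.
    level2 = {s: sum(ws) / len(ws) for s, ws in by_school.items()}
    # Level-1 conditional weight: overall weight / approximated school weight.
    level1 = [w / level2[s] for s, w in records]
    return level2, level1
```

For example, two students in school "B" with overall weights 20 and 40 yield an approximated school weight of 30 and conditional weights of 2/3 and 4/3.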

Third, the project determined the most robust method of sampling variance estimation by comparing the performance of a sandwich estimator with replication methods. Bias found with each technique was expected to vary with the data and sampling conditions, so typical conditions for education-related datasets were examined using Monte Carlo simulations. The simulated data (introduced in Aim 1) were analyzed with the MPML method under three different approaches to sampling variance estimation (linearized, jackknife replication, and bootstrap replication) to determine which method yields the least bias in sampling variances and adequate 95 percent confidence interval coverage rates for the parameters of interest.
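To illustrate the replication idea, here is a minimal sketch of a delete-one-cluster (JK1) jackknife applied to a weighted mean. The estimator and data layout are assumptions for demonstration; the project applied replication to MPML multilevel estimates, not to a simple mean.

```python
def weighted_mean(data):
    """data: list of (cluster_id, weight, y) tuples."""
    return sum(w * y for _, w, y in data) / sum(w for _, w, _ in data)

def jk1_variance(data):
    """Delete-one-cluster jackknife variance of the weighted mean:
    re-estimate with each cluster (PSU) deleted in turn, then combine
    the squared deviations of the replicates from the full estimate."""
    clusters = sorted({c for c, _, _ in data})
    G = len(clusters)
    full = weighted_mean(data)
    replicates = [weighted_mean([r for r in data if r[0] != g])
                  for g in clusters]
    return (G - 1) / G * sum((t - full) ** 2 for t in replicates)
```

Because whole clusters are deleted, the replicates reflect the between-PSU variability that the sampling design induces, which is what a naive variance formula that ignores clustering misses.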

Fourth, the project examined the performance of the scaled change-in-chi-squared test statistic for model selection, both when the sampling design was taken into account and when it was not. The fit of the models run for Aims 1 and 3 was compared with the fit of six misspecified models: three over-specified and three under-specified.
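A scaled chi-squared difference test of this kind is commonly computed with the Satorra-Bentler correction; the sketch below assumes that formulation and that the unscaled statistics, degrees of freedom, and scaling correction factors of the two nested models are available. It is an illustration of the general technique, not necessarily the exact variant the project studied.

```python
def scaled_chisq_diff(t0, df0, c0, t1, df1, c1):
    """Satorra-Bentler scaled chi-square difference test (a common
    formulation, assumed here for illustration).
    t0, df0, c0: unscaled chi-square, degrees of freedom, and scaling
    correction factor of the more restricted (nested) model.
    t1, df1, c1: the same quantities for the comparison model.
    Returns (scaled difference statistic, difference in df)."""
    cd = (df0 * c0 - df1 * c1) / (df0 - df1)  # difference-test scaling factor
    return (t0 - t1) / cd, df0 - df1
```

The key point is that the naive difference of two scaled chi-square statistics is not itself chi-square distributed, so the difference must be rescaled by a factor built from the two models' scaling corrections before it is referred to a chi-square distribution.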

Questions about this project?

To answer additional questions about this project or provide feedback, please contact the program officer.


Tags

Data and Assessments, Mathematics


