IES Grant

Title: An Adaptive Testing System for Diagnosing Sources of Mathematics Difficulties
Center: NCER Year: 2010
Principal Investigator: Embretson, Susan Awardee: Georgia Institute of Technology
Program: Science, Technology, Engineering, and Mathematics (STEM) Education
Award Period: 4 years Award Amount: $1,854,393
Type: Measurement Award Number: R305A100234

Co-Principal Investigators: Bruce Walker, John Poggio, Neal Kingston, and Edward Meyen

Purpose: Current standards-based state accountability tests typically find that many students do not meet proficiency standards, but these tests cannot pinpoint the specific sources of students' difficulties so that differentiated instruction can be provided to remedy them. The purpose of this project was to build an online, on-demand adaptive assessment system to diagnose sources of students' mathematics difficulties and deficits linked to state standards-based assessments.

Project: The researchers developed an online assessment system to diagnose sources of students' mathematics deficits and to provide teachers with information to guide instruction.

Products: The outcomes of the project included an adaptive testing system for identifying sources of middle school students' mathematical deficits that are related to state accountability tests, along with published reports.

Structured Abstract

Setting: The study was conducted on a representative sample of middle schools in a midwestern state in the U.S.

Population: The study sample consisted of approximately 3,000 students each in Grades 6 and 7. The students were representative of the population of the midwestern state, with a special focus on students performing below curriculum-based performance standards, many of whom were from low-income or ethnic minority groups.

Intervention: The assessment system included a large item bank developed through automatic item generation. Assessments were administered adaptively and on demand by computer for efficient and explicit diagnosis. Items were aligned with national standards for basic mathematics and also had well-specified sources of cognitive complexity that were calibrated in an item response theory (IRT) model. Items were selected adaptively to provide maximum information under diagnostic IRT models, yielding efficient and valid measurement for each student. The diagnostic assessment system recommended specific lesson plans, prior-knowledge exercises, and online student tutorials directly tied to the state standards available from the Blending Assessment with Instruction Project (BAIP). BAIP is widely used by teachers, and its direct interface with the diagnostic system allowed the seamless collection of validity data.
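The maximum-information selection rule described above can be sketched in a few lines. This is a minimal illustration only: the two-parameter logistic (2PL) item model and the item parameters below are assumptions for demonstration, not the project's actual diagnostic IRT model or item bank.

```python
import math

def p_correct(theta, a, b):
    """2PL probability of a correct response at ability theta,
    with discrimination a and difficulty b."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def fisher_information(theta, a, b):
    """Fisher information of a 2PL item at theta: a^2 * P * (1 - P)."""
    p = p_correct(theta, a, b)
    return a * a * p * (1.0 - p)

def select_next_item(theta_hat, item_bank, administered):
    """Choose the unadministered item with maximum information
    at the current ability estimate theta_hat."""
    candidates = [(i, fisher_information(theta_hat, a, b))
                  for i, (a, b) in enumerate(item_bank)
                  if i not in administered]
    return max(candidates, key=lambda pair: pair[1])[0]

# Hypothetical item bank: (discrimination a, difficulty b) pairs.
bank = [(1.2, -1.0), (0.8, 0.0), (1.5, 0.5), (1.0, 1.2)]
next_item = select_next_item(0.4, bank, administered={0})
```

In an operational adaptive test this selection step alternates with re-estimation of the student's ability (or diagnostic profile) after each response, stopping when a precision or length criterion is met.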

Research Design and Methods: The diagnostic system for the assessments was developed by combining standards-based knowledge categorizations of mathematics items with cognitive models of mathematical problem solving. The assessment system consisted of seven components: (1) a diagnostic system; (2) an item bank; (3) diagnostic IRT model calibrations; (4) adaptive item selection modules; (5) an interactive online test delivery module; (6) a score report module; and (7) a validity module. A series of studies was conducted on assessment development and testing. To examine the external and consequential aspects of validity, the relations between diagnostic scores and performance on the online tutorials were examined using the BAIP system. The adaptive diagnostic assessments were tested with a stratified random sample of teachers and their students. Students' adaptive diagnostic test scores were linked to scores obtained from the summative state standardized test in mathematics to evaluate the appropriate functioning of the diagnostic assessment system.

Control Condition: Due to the nature of this project, there was no control condition.

Key Measures: The key measures for the study included students' scores on the diagnostic mathematics assessments and students' end-of-year scores on the state standardized mathematics assessment.

Data Analytic Strategy: Structured item response theory (IRT) models were used to examine both the response process and the content validity of the models used to generate diagnostic items. A diagnostic IRT model was calibrated and implemented for the adaptive testing system. External aspects of validity were assessed with hierarchical linear models examining the relation between diagnostic scores of learning processes and performance on the online tutorials. In addition, structural equation modeling was used to study convergent and discriminant validity relationships.
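The multicomponent latent trait model named in the publications below treats an item as solved only when every underlying cognitive component is executed successfully, so component success probabilities multiply. The sketch below assumes Rasch-type component models with hypothetical parameters; it is an illustration of the general model form, not the project's calibrated model.

```python
import math

def component_prob(theta_k, b_ik):
    """Rasch-type probability of succeeding on component k of item i,
    given component ability theta_k and component difficulty b_ik."""
    return 1.0 / (1.0 + math.exp(-(theta_k - b_ik)))

def mltm_prob(thetas, difficulties):
    """Multicomponent latent trait model: the item is solved only if
    every component succeeds, so component probabilities multiply."""
    p = 1.0
    for theta_k, b_ik in zip(thetas, difficulties):
        p *= component_prob(theta_k, b_ik)
    return p

# Hypothetical two-component item (e.g., comprehension then computation).
p_solve = mltm_prob(thetas=[0.5, -0.2], difficulties=[0.0, 0.3])
```

Because each component carries its own ability parameter, calibrating such a model lets the system report which component (e.g., problem comprehension vs. computation) is the likely source of a student's difficulty, which is the diagnostic goal described above.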

Products and Publications

Book chapter

Embretson, S.E. (2016). Multicomponent Models. In W. van der Linden, and R. Hambleton (Eds.), Handbook of Item Response Theory (2nd ed., pp. 225–242). New York: Taylor and Francis Inc.

Journal articles

Embretson, S.E. (2015). The Multicomponent Latent Trait Model for Diagnosis: Applications to Heterogeneous Test Domains. Applied Psychological Measurement, 39 (1): 16–30.

Embretson, S.E. (2016). Understanding Examinees' Responses to Items: Implications for Measurement. Educational Measurement: Issues and Practice, 35 (3): 6–22.

Embretson, S.E., and Yang, X. (2013). A Multicomponent Latent Trait Model for Diagnosis. Psychometrika, 78 (1): 14–36.

Morrison, K., and Embretson, S.E. (2014). Abstract: Using Cognitive Complexity to Measure the Psychometric Properties of Mathematics Assessment Items. Multivariate Behavioral Research, 49 (3): 292–293.

Nongovernment reports, issue briefs, or practice guides

Embretson, S.E. (2013). Adaptive Diagnosis of Specific Skills on Heterogeneous Tests With Multistage Testing. Atlanta, GA: Cognitive Measurement Laboratory, Georgia Institute of Technology.

Embretson, S.E., and Poggio, J. (2013). An Empirical Evaluation of Automated Item Generation of Test Items Used on Middle School Assessments. Atlanta, GA: Cognitive Measurement Laboratory, Georgia Institute of Technology.

Gillmor, S., Poggio, J., and Embretson, S.E. (2013). Effects of Reducing (Extraneous) Cognitive Complexity of Mathematics Test Items on Student Performance. Atlanta, GA: Cognitive Measurement Laboratory, Georgia Institute of Technology.

Poggio, J., and Embretson, S.E. (2012). Indicators, Benchmarks and Standards in Mathematical Achievement Tests in Middle School: Comparison Across Grade Levels and to Common Core Standards. Atlanta, GA: Cognitive Measurement Laboratory, Georgia Institute of Technology.

Embretson, S.E., Morrison, K., and Jun, H.W. (2015). The Reliability of Diagnosing Broad and Narrow Skills in Middle School Mathematics With the Multicomponent Latent Trait Model. In Quantitative Psychology Research (pp. 17–26).