Project Activities
The researchers developed an online diagnostic assessment system to diagnose sources of students' mathematics deficits and provide information to teachers to guide instruction.
Structured Abstract
Setting
The study was conducted on a representative sample of middle schools in a midwestern state in the U.S.
Sample
The study sample consisted of approximately 3,000 students each in Grades 6 and 7. The students were representative of the population of the midwestern state, with a special focus on students with competencies below curriculum-based performance standards, many of whom were from low-income or ethnic minority groups.
The assessment system included a large item bank developed through automatic item generation. The assessments were administered adaptively and on demand by online computers for efficient and explicit diagnosis. Items were aligned with national standards for basic mathematics and also had well-specified sources of cognitive complexity that were calibrated in an item response theory (IRT) model. Items were selected adaptively to provide maximum information under diagnostic IRT models, supporting efficient and valid measurement of each student. The diagnostic assessment system recommended specific lesson plans, prior-knowledge exercises, and online student tutorials directly related to state standards, available from the Blending Assessment with Instruction Project (BAIP). BAIP is widely used by teachers, and its direct interface with the diagnostic system allowed seamless collection of validity data.
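The abstract does not specify the selection rule beyond "maximum information," so the following is only an illustrative sketch of how such a rule is commonly implemented: at each step, the item with the greatest Fisher information at the current ability estimate is chosen. The two-parameter logistic (2PL) model, the function names, and the item parameters below are assumptions for illustration, not the project's actual system.

```python
import math

def p_correct(theta, a, b):
    """2PL item response function: probability of a correct response
    at ability theta, with discrimination a and difficulty b."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def item_information(theta, a, b):
    """Fisher information of a 2PL item at ability theta: a^2 * p * (1 - p)."""
    p = p_correct(theta, a, b)
    return a * a * p * (1.0 - p)

def select_next_item(theta_hat, item_bank, administered):
    """Pick the not-yet-administered item with maximum information at theta_hat."""
    candidates = [(i, item_information(theta_hat, a, b))
                  for i, (a, b) in enumerate(item_bank)
                  if i not in administered]
    return max(candidates, key=lambda pair: pair[1])[0]

# Hypothetical item bank of (discrimination, difficulty) pairs.
bank = [(1.2, -0.5), (0.8, 0.0), (1.5, 0.4), (1.0, 1.1)]
next_item = select_next_item(0.3, bank, administered={0})  # item 0 already given
```

Each administered response would then update the ability estimate before the next selection, which is what makes the test adaptive.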
Research design and methods
The diagnostic system for the assessments was developed by combining standards-based knowledge categorizations of mathematics items with cognitive models of mathematical problem solving. The assessment system consists of seven components: (1) a diagnostic system; (2) an item bank; (3) diagnostic IRT model calibrations; (4) adaptive item selection modules; (5) an interactive online test delivery module; (6) a score report module; and (7) a validity module. A series of studies was conducted focusing on assessment development and testing. To examine the external and consequential aspects of validity, the relations between diagnostic scores and performance on online tutorials were examined using the BAIP system. The adaptive diagnostic assessments were tested with a stratified random sample of teachers and their students. Students' adaptive diagnostic test scores were linked to their scores on the summative state standardized mathematics test to evaluate the functioning of the diagnostic assessment system.
Control condition
Due to the nature of this project, there was no control condition.
Key measures
The key measures for the study included students' scores on the diagnostic mathematics assessments and students' end-of-year mathematics scores on the state standardized assessment.
Data analytic strategy
Structured item response theory (IRT) models were used to examine both the response-process and content aspects of validity for the models used to generate diagnostic items. A diagnostic IRT model was calibrated and implemented in the adaptive testing system. External aspects of validity were assessed using hierarchical linear models to examine the relation between diagnostic scores of learning processes and performance on the online tutorials. In addition, structural equation modeling was used to study convergent and discriminant validity relationships.
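The diagnostic IRT model is not spelled out here, but the project's publications center on the multicomponent latent trait model (MLTM). As a simplified, illustrative sketch of the noncompensatory MLTM idea: an item is solved only if every underlying component (skill) is solved, so the item probability is a product of component probabilities. The Rasch-type component form and the function names below are assumptions for illustration.

```python
import math

def component_prob(theta_k, b_ik):
    """Rasch-type probability that component k of item i is solved,
    given component ability theta_k and component difficulty b_ik."""
    return 1.0 / (1.0 + math.exp(-(theta_k - b_ik)))

def mltm_prob(thetas, component_difficulties):
    """Noncompensatory MLTM sketch: the item is answered correctly
    only if all components are solved, so probabilities multiply."""
    p = 1.0
    for theta_k, b_ik in zip(thetas, component_difficulties):
        p *= component_prob(theta_k, b_ik)
    return p
```

Because each component carries its own ability parameter, fitting such a model yields a profile of component skills per student rather than a single score, which is what makes the assessment diagnostic.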
Products and publications
Products: The outcomes of the project included an adaptive testing system for identifying sources of middle school students' mathematical deficits that are related to state accountability tests, along with published reports.
Book chapter
Embretson, S.E. (2016). Multicomponent Models. In W. van der Linden and R. Hambleton (Eds.), Handbook of Item Response Theory (2nd ed., pp. 225-242). New York: Taylor and Francis Inc.
Journal articles
Embretson, S.E. (2015). The Multicomponent Latent Trait Model for Diagnosis: Applications to Heterogeneous Test Domains. Applied Psychological Measurement, 39 (1): 16-30.
Embretson, S.E. (2016). Understanding Examinees' Responses to Items: Implications for Measurement. Educational Measurement: Issues and Practice, 35 (3): 6-22.
Embretson, S.E., and Yang, X. (2013). A Multicomponent Latent Trait Model for Diagnosis. Psychometrika, 78 (1): 14-36.
Morrison, K., and Embretson, S.E. (2014). Abstract: Using Cognitive Complexity to Measure the Psychometric Properties of Mathematics Assessment Items. Multivariate Behavioral Research, 49 (3): 292-293.
Nongovernment reports, issue briefs, or practice guides
Embretson, S.E. (2013). Adaptive Diagnosis of Specific Skills on Heterogeneous Tests With Multistage Testing. Atlanta, GA: Cognitive Measurement Laboratory, Georgia Institute of Technology.
Embretson, S.E., and Poggio, J. (2013). An Empirical Evaluation of Automated Item Generation of Test Items Used on Middle School Assessments. Atlanta, GA: Cognitive Measurement Laboratory, Georgia Institute of Technology.
Gillmor, S., Poggio, J., and Embretson, S.E. (2013). Effects of Reducing (Extraneous) Cognitive Complexity of Mathematics Test Items on Student Performance. Atlanta, GA: Cognitive Measurement Laboratory, Georgia Institute of Technology.
Poggio, J., and Embretson, S.E. (2012). Indicators, Benchmarks and Standards in Mathematical Achievement Tests in Middle School: Comparison Across Grade Levels and to Common Core Standards. Atlanta, GA: Cognitive Measurement Laboratory, Georgia Institute of Technology.
Proceeding
Embretson, S., Morrison, K., and Jun, H.W. (2015). The Reliability of Diagnosing Broad and Narrow Skills in Middle School Mathematics with the Multicomponent Latent Trait Model. In Quantitative Psychology Research (pp. 17-26).