Science, Technology, Engineering, and Mathematics (STEM) Education

The Cognitive, Psychometric, and Instructional Validity of Curriculum-Embedded Assessments: In-Depth Analyses of the Resources Available to Teachers Within "Everyday Mathematics"

Year: 2009
Name of Institution: University of Illinois, Chicago
Goal: Measurement
Principal Investigator: Pellegrino, James
Award Amount: $1,943,388
Award Period: 4 years
Award Number: R305A090111

Description:

Co-Principal Investigators: Susan Goldman, Louis DiBello, William Stout, and Alison Castro

Purpose: Progress toward improving the quality of mathematics teaching and learning has been slow despite the introduction of standards-based curricula, such as Everyday Mathematics, and efforts to support teachers' implementation of reform-based classroom practices. Embedded assessments are considered to be a key element of reform-based curricula with the goal of supporting more effective instructional practices such as making students' thinking visible and providing opportunities for feedback. However, if embedded assessments are to have a positive impact on teaching and learning, they must overcome three central problems of assessment practice: (1) assessment scope and quality; (2) coordination of multiple assessment functions; and (3) support for teacher use of assessment information. In the current study, the researchers will test the validity of embedded assessments within the Everyday Mathematics curriculum by addressing all three problems of assessment practice.

Project Activities: The goal of the project is to test the validity of embedded assessments within the Everyday Mathematics curriculum. The focus will be on three content strands within the curriculum at grades 3 and 5: Number and Numeration; Operations and Computation; and Patterns, Functions, and Algebra. Multiple aspects of validity will be examined including content, criterion, construct, and consequential validity. In addition, data will be gathered to examine converging evidence regarding three forms of validity: cognitive validity, instructional validity, and psychometric validity.

Products: The outcomes of the project include (1) a detailed description of the strengths and limitations of the assessments found within the Everyday Mathematics curriculum, accompanied by a blueprint for their improvement and redesign; (2) an empirically and theoretically driven model for how the embedded assessment components of a standards-based math curriculum should be designed, implemented, and evaluated; (3) a comprehensive, research-based evaluation design that provides specific methods for investigating the validity of embedded assessments and that is adaptable to multiple curricula and to areas other than mathematics; and (4) published reports.

Structured Abstract

Setting: The setting for this study is a large, urban school district in Illinois.

Population: The study sample will consist of teachers and students from 16 classrooms each at grades 3 and 5 over each of three data collection years. Thus, for each of the two grade levels, data will be collected across three years from a total of 48 classrooms. The student population is ethnically diverse and contains a high percentage of students from low-income backgrounds.

Intervention: The researchers will work with teachers and curriculum specialists to examine the validity of embedded assessments in the Everyday Mathematics curriculum. The focus will be on three content strands within Everyday Mathematics at grades 3 and 5: Number and Numeration; Operations and Computation; and Patterns, Functions, and Algebra.

Research Design and Methods: Within each of the content strands, the researchers will map each embedded assessment activity to one or more learning goals, and examine the validity of the assessments relative to their structural connection to learning goals and instruction. Traditional notions of validity, including content, criterion, construct, and consequential validity, will be adapted to the context of embedded classroom assessment. Multiple converging sources of evidence will be examined regarding the three aspects of validity that are most salient for classroom assessment: cognitive validity, instructional validity, and psychometric validity. Cognitive validity focuses on the extent to which the embedded assessment taps important forms of mathematical knowledge and skills in ways that are not confounded with other aspects of cognition such as language or working memory load. Instructional validity focuses on the extent to which an assessment supports teaching practice and provides valuable and timely instructional information. Psychometric validity focuses on the extent to which the assessment reliably yields model-based information about student performance, especially for diagnostic purposes.

Data will be collected in three stages. Stage 1 consists of analyses, by 6 to 8 experts in the field, of the cognitive and instructional properties of approximately 300 embedded assessments (150 per grade level) relative to their mappings to curricular learning goals. Stage 2 consists of protocol observation and interviews with individual students. For each assessment selected for Stage 2 (approximately 100 assessments based on findings from Stage 1, 50 for each grade level), 16 students will be randomly selected, stratified within school performance quartiles, and observed and videotaped as they perform the assessment activities. Teacher data will also be gathered through surveys, logs, and interviews as part of Stage 2. Stage 3 is a large-scale study in which data will be collected from teachers and students in 16 classrooms each at grades 3 and 5 in each of the three data collection years, for a total of approximately 48 classrooms and 1,200 students at each grade. Students in this study will perform each assessment activity as part of their regular classroom work, and teachers will send student response data to the researchers for scoring and analysis. The large-scale study will focus on at least 8 collections of assessments at grades 3 and 5.
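
The Stage 2 selection of 16 students stratified by performance quartile can be illustrated with a minimal Python sketch. The abstract does not state how the 16 students are allocated across quartiles; the equal allocation of 4 per quartile, the function name stratified_sample, and the placeholder roster are assumptions made for illustration only.

```python
import random

def stratified_sample(students_by_quartile, total=16, seed=0):
    """Draw an equal number of students at random from each quartile.

    Assumption: equal allocation (total divided evenly across strata).
    The abstract only says selection is stratified within quartiles.
    """
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    per_stratum = total // len(students_by_quartile)
    sample = []
    for quartile in sorted(students_by_quartile):
        pool = students_by_quartile[quartile]
        # rng.sample draws without replacement from this quartile only
        sample.extend(rng.sample(pool, per_stratum))
    return sample

# Hypothetical roster: 30 placeholder student IDs in each of 4 quartiles.
roster = {q: [f"Q{q}-student{i}" for i in range(1, 31)] for q in range(1, 5)}
selected = stratified_sample(roster)
print(len(selected), "students selected")
```

Sampling within each quartile separately guarantees that every performance band is represented for each assessment, which a simple random draw of 16 students would not.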

Control Condition: There is no control condition.

Key Measures: The key measures include the selected embedded assessments within the Everyday Mathematics curriculum and students' end-of-year mathematics test scores on the Illinois Standards Achievement Test (ISAT).

Data Analytic Strategy: Item response theory will be used to model assessment items and persons together and assign to each student a "latent" measure of proficiency. In addition, a Fusion model will be used for skills diagnostic analysis to infer, for each student, a profile of mastery of a centrally important set of designated knowledge and skills. To test predictive validity, correlations between the embedded assessment data and ISAT scores will be examined.
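
As a rough illustration of the two quantities named in this strategy, the Python sketch below computes an item response probability and a Pearson correlation. The abstract does not say which item response model will be fit; the two-parameter logistic (2PL) form is used here only as a common example, and the function names and toy numbers are hypothetical, not project data. The Fusion model is not sketched.

```python
import math

def irt_2pl(theta, a, b):
    """Probability of a correct response under a 2PL model (assumed form).

    theta: student's latent proficiency
    a:     item discrimination
    b:     item difficulty
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    var_x = sum((xi - mean_x) ** 2 for xi in x)
    var_y = sum((yi - mean_y) ** 2 for yi in y)
    return cov / math.sqrt(var_x * var_y)

# A student whose proficiency equals the item difficulty has a 50% chance.
print(irt_2pl(theta=0.0, a=1.0, b=0.0))

# Toy placeholder scores, invented for illustration only.
embedded_scores = [1.0, 2.0, 3.0, 4.0]
state_test_scores = [2.0, 4.0, 6.0, 8.0]
print(pearson_r(embedded_scores, state_test_scores))
```

In a predictive validity check of this kind, a higher correlation between embedded assessment scores and end-of-year test scores would indicate that the embedded assessments track the same proficiency the state test measures.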

Products and Publications

Journal article, monograph, or newsletter

DiBello, L.V., Henson, R.A., and Stout, W.F. (2015). A Family of Generalized Diagnostic Classification Models for Multiple Choice Option-Based Scoring. Applied Psychological Measurement, 39 (1): 62–79.

Pellegrino, J.W., DiBello, L.V., and Goldman, S.R. (2016). A Framework for Conceptualizing and Evaluating the Validity of Instructionally Relevant Assessments. Educational Psychologist, 51 (1): 59–81.