|Title:||Principled Science Assessment Designs for Students with Disabilities|
|Principal Investigator:||Haertel, Geneva||Awardee:||SRI International|
|Program:||Systems, Policy, and Finance|
|Award Period:||3/1/2007 to 2/28/2011||Award Amount:||$1,599,939|
Funded under the Assessment for Accountability topic prior to the establishment of the Systemic Interventions and Policies for Special Education topic.
Purpose: The No Child Left Behind Act requires that students with disabilities be included in state assessments and accountability. However, the use of accommodations, modifications, and alternate assessments to permit the inclusion of students with disabilities has given rise to a number of issues related to fairness and test validity. Recently, researchers have begun to explore whether tests can be designed from the outset to be more accessible and valid for a wider range of students; this approach is termed "universal design." The researchers on this project will study the use of universal design paired with an approach termed "evidence-centered design" to develop or redesign items that can more accurately evaluate the knowledge and skills of all students on statewide assessments. The academic content focus of this study is middle school science, but if successful the approach can be applied to other topics and age ranges. The researchers' specific goals are (1) to evaluate the validity of inferences that can be drawn from existing state science assessments for students with and without disabilities, (2) to redesign assessment items to increase the validity for students both with and without disabilities, (3) to conduct empirical studies of the validity of inferences drawn from the scores on the redesigned items, and (4) to develop research-based guidelines that can be used in test development to increase the validity of inferences from science assessment scores for all students.
Project Activities: The researchers will analyze existing middle school science items from four states and will select two of these states for which to redesign selected items and test the validity of inferences that can be drawn from these items. The researchers will then administer assessments that include original and redesigned items to students with and without disabilities, and analyze the data to determine the extent to which the redesign permits the test to assess science content independent of irrelevant factors associated with the students' disabilities (e.g., sensory or physical impairments that limit test responses, specific academic deficits unrelated to science content, fatigue associated with health impairments).
Products: The expected outcomes from this study include reports of research findings as well as research-based guidelines that states and assessment developers can use to increase the validity of inferences from science assessment scores for all students. Samples of assessment products used during the redesign process, including design patterns and templates, and newly designed assessment tasks will be shared as well.
Setting: The project will be conducted in Kansas, Kentucky, Nevada, and South Carolina.
Population: The project will randomly select approximately 60 eighth-grade classrooms across the two states selected to participate in the item redesign and validity research. From these classes, students with learning disabilities or mild mental retardation will be selected, and a random sample of general education students will be selected to meet the goal of recruiting 100 students with disabilities and 100 general education students within each state.
Intervention: Universal Design for Learning improves accessibility by providing multiple means of representing information, multiple means of expression, and multiple means of engagement. Evidence-Centered Design conceives of assessment as an argument from imperfect evidence, and it structures the evidentiary argument by means of five key layers of assessment design and implementation: domain analysis, domain modeling, the conceptual assessment framework, assessment implementation, and assessment delivery. The Principled Assessment Designs for Inquiry System is a previously developed online tool embodying Evidence-Centered Design and Universal Design for Learning that offers a collection of development resources for designing science assessments. This system employs web-based structures (design patterns, templates, and task specifications) to guide assessment designers through the development of a coherent, evidence-centered assessment.
Research Design and Methods: A two-group repeated measures design will be used, in which students with and without disabilities will receive different forms of the assessment. The selected students with disabilities as well as an equal number of randomly chosen general education students will be administered standardized tests of reading and mathematics achievement as discussed below, as well as test booklets combining original and redesigned science items from both states. Constructed-response items will be scored according to project-developed rubrics with appropriate safeguards to ensure consistency and reliability of scoring. Students with disabilities will be given test accommodations as specified on their IEPs.
Control Condition: Students' scores on the original, unmodified items from state science assessments will serve as the control condition.
Key Measures: In addition to the science assessments and items discussed above, the Reading Comprehension and Mathematics Problem-Solving subtests from the abbreviated battery of The Stanford Achievement Test (version 9) will be administered to obtain students' reading and mathematics achievement scores for use as covariates.
Data Analytic Strategy: The project will analyze the effects of the item redesign through several analyses: logistic regression comparing student groups and item conditions (original vs. redesigned), regression analysis on reading and math achievement, Rasch Item Response Theory modeling to determine whether the redesigned items are substantially easier than the original versions, and a structural equation analysis to test whether the item redesigns reduce the effects of construct-irrelevant variance. Additional analyses will focus on the difficulty of items in original and redesigned forms, comparisons of effects for different types of knowledge, analysis of item features that may predict gains in performance for students with disabilities as compared with general education students, and the factor structures of original and redesigned items for different student groups, science content, and achievement levels.
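As a minimal illustration of the Rasch component of this strategy, the sketch below computes the probability of a correct response from a student's ability and an item's difficulty, both on the logit scale. The function name and the difficulty values are hypothetical and chosen only for illustration; they are not taken from the project's data or software.

```python
import math


def rasch_probability(theta: float, difficulty: float) -> float:
    """Probability of a correct response under the Rasch model.

    theta      -- student ability in logits
    difficulty -- item difficulty in logits
    """
    return 1.0 / (1.0 + math.exp(-(theta - difficulty)))


# Hypothetical difficulties for one item before and after redesign
# (illustrative values only, not project results).
original_difficulty = 0.5
redesigned_difficulty = -0.5

# A student of average ability (theta = 0) on each version of the item.
p_original = rasch_probability(0.0, original_difficulty)
p_redesigned = rasch_probability(0.0, redesigned_difficulty)

print(f"original: {p_original:.3f}, redesigned: {p_redesigned:.3f}")
```

Under this model, a lower estimated difficulty for the redesigned version means a higher success probability at every ability level, which is the sense in which a redesigned item would be "easier" than its original.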
Haertel, G.D., Vendlinski, T.P., Rutstein, D., DeBarger, A., Cheng, B.H., Ziker, C., Snow, E.B., D'Angelo, C., Harris, C.J., Yarnall, L., and Ructtinger, L. (2016). General Introduction to Evidence-Centered Design. In H.I. Braun (Ed.), Meeting the Challenges to Measurement in an Era of Accountability (pp. 107–148). New York: Routledge.
Haertel, G.D., Vendlinski, T.P., Rutstein, D., DeBarger, A., Cheng, B.H., Snow, E.B., D'Angelo, C., Harris, C., Yarnall, L., and Ructtinger, L. (2016). General Introduction to Evidence-Centered Design. In H.I. Braun (Ed.), Meeting the Challenges to Measurement in an Era of Accountability (pp. 107–148). New York: Routledge.
Haertel, G.D., Vendlinski, T.P., Rutstein, D., DeBarger, A., Cheng, B.H., Ziker, C., Harris, C.J., D'Angelo, C., Snow, E.B., Bienkowski, M., and Ructtinger, L. (2016). Assessing the Life Sciences: Using Evidence-Centered Design for Accountability Purposes. In H.I. Braun (Ed.), Meeting the Challenges to Measurement in an Era of Accountability (pp. 267–327). New York: Routledge.
Journal article, monograph, or newsletter
Mislevy, R.J., Haertel, G., Cheng, B.H., Ructtinger, L., DeBarger, A., Murray, E., Rose, D., Gravel, J., Colker, A.M., Rutstein, D., and Vendlinski, T. (2013). A “Conditional” Sense of Fairness in Assessment. Educational Research and Evaluation, 19(2): 121–140. doi:10.1080/13803611.2013.767614