The researchers will analyze existing middle school science items from four states and will select two of these states in which to redesign selected items and test the validity of the inferences that can be drawn from them. The researchers will then administer assessments that include both original and redesigned items to students with and without disabilities and analyze the data to determine the extent to which the redesign permits the test to assess science content independent of irrelevant factors associated with the students' disabilities (e.g., sensory or physical impairments that limit test responses, specific academic deficits unrelated to science content, or fatigue associated with health impairments).
Setting
The project will be conducted in Kansas, Kentucky, Nevada, and South Carolina.
Sample
The project will randomly select approximately 60 eighth-grade classrooms across the two states chosen for the item redesign and validity research. From these classrooms, students with learning disabilities or mild mental retardation will be selected, along with a random sample of general education students, to meet the goal of recruiting 100 students with disabilities and 100 general education students in each state.
A two-group, repeated-measures design will be used in which students with and without disabilities receive different forms of the assessment. The selected students with disabilities, as well as an equal number of randomly chosen general education students, will be administered standardized tests of reading and mathematics achievement (discussed below) and test booklets combining original and redesigned science items from both states. Constructed-response items will be scored according to project-developed rubrics, with appropriate safeguards to ensure consistency and reliability of scoring (one such safeguard is sketched below). Students with disabilities will receive the test accommodations specified in their individualized education programs (IEPs).
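As an illustrative sketch only (the project does not specify its scoring procedures, and the data and names below are hypothetical), one common safeguard is to double-score a subset of constructed responses and check inter-rater agreement, for example with a weighted Cohen's kappa in Python:

    from sklearn.metrics import cohen_kappa_score

    # Hypothetical rubric scores (0-3 scale) assigned independently by two
    # raters to the same set of constructed responses.
    rater_a = [2, 3, 1, 0, 2, 3, 2, 1]
    rater_b = [2, 3, 1, 1, 2, 3, 2, 1]

    # Quadratically weighted kappa credits near-misses on an ordered rubric
    # scale; values near 1 indicate strong agreement between raters.
    kappa = cohen_kappa_score(rater_a, rater_b, weights="quadratic")
    print(f"Quadratically weighted kappa: {kappa:.2f}")

An agreement threshold (e.g., kappa above .80) could then trigger rater retraining or adjudication before the remaining responses are scored.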
Control condition
Students' scores on the original, unmodified items from state science assessments will serve as the control condition.
Key measures
In addition to the science assessments and items discussed above, the Reading Comprehension and Mathematics Problem Solving subtests from the abbreviated battery of the Stanford Achievement Test, Ninth Edition, will be administered to obtain students' reading and mathematics achievement scores for use as covariates.
Data analytic strategy
The project will analyze the effects of the item redesign through several complementary analyses: logistic regression comparing student groups and item conditions (original vs. redesigned), regression analyses incorporating reading and mathematics achievement, Rasch item response theory modeling to determine whether the redesigned items are substantially easier than the original versions, and structural equation modeling to test whether the item redesigns reduce the effects of construct-irrelevant variance. Additional analyses will examine the difficulty of items in original and redesigned forms, compare effects across different types of knowledge, identify item features that may predict performance gains for students with disabilities relative to general education students, and determine the factor structures of original and redesigned items for different student groups, science content areas, and achievement levels.
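To make the first of these analyses concrete, a minimal sketch follows, assuming a hypothetical long-format data file and column names (the project's actual models are not specified here). It fits a logistic regression with the Python statsmodels library, predicting item correctness from student group, item condition, and their interaction, with reading achievement as a covariate:

    import pandas as pd
    import statsmodels.formula.api as smf

    # Hypothetical long-format file: one row per student-item response, with
    # columns 'correct' (0/1), 'group' ("SWD" or "GenEd"), 'condition'
    # ("original" or "redesigned"), and 'reading_score' (a scale score).
    responses = pd.read_csv("item_responses.csv")

    # Logistic regression of item correctness on group, condition, and their
    # interaction, controlling for reading achievement. An interaction
    # indicating larger redesign gains for students with disabilities would
    # be consistent with reduced construct-irrelevant variance.
    model = smf.logit("correct ~ group * condition + reading_score",
                      data=responses).fit()
    print(model.summary())

In practice, the nesting of responses within students and classrooms would also need to be modeled, which is one motivation for the Rasch and structural equation analyses described above.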
Products
The expected outcomes from this study include reports of research findings as well as research-based guidelines that states and assessment developers can use to increase the validity of inferences from science assessment scores for all students. Samples of assessment products used during the redesign process, including design patterns and templates, will also be shared, along with newly designed assessment tasks.
Book chapter
Haertel, G.D., Vendlinski, T.P., Rutstein, D., DeBarger, A., Cheng, B.H., Ziker, C., Snow, E.B., D'Angelo, C., Harris, C.J., Yarnall, L., and Ructtinger, L. (2016). General Introduction to Evidence-Centered Design. In H.I. Braun (Ed.), Meeting the Challenges to Measurement in an Era of Accountability (pp. 107-148). New York: Routledge.
Haertel, G.D., Vendlinski, T.P., Rutstein, D., DeBarger, A., Cheng, B.H., Ziker, C., Harris, C.J., D'Angelo, C., Snow, E.B., Bienkowski, M., and Ructtinger, L. (2016). Assessing the Life Sciences: Using Evidence-Centered Design for Accountability Purposes. In H.I. Braun (Ed.), Meeting the Challenges to Measurement in an Era of Accountability (pp. 267-327). New York: Routledge.
Journal article, monograph, or newsletter
Mislevy, R.J., Haertel, G., Cheng, B.H., Ructtinger, L., DeBarger, A., Murray, E., Rose, D., Gravel, J., Colker, A.M., Rutstein, D., and Vendlinski, T. (2013). A “Conditional” Sense of Fairness in Assessment. Educational Research and Evaluation, 19(2): 121-140. doi:10.1080/13803611.2013.767614
Funded under the Assessment for Accountability topic prior to the establishment of the Systemic Interventions and Policies for Special Education topic.