Publications & Products

Search Results: (1-4 of 4 records)

 Pub Number  Title  Date
REL 2017191 The content, predictive power, and potential bias in five widely used teacher observation instruments
This study was designed to inform decisions about the selection and use of five widely used teacher observation instruments. The purpose was to explore (1) patterns across instruments in the dimensions of instruction that they measure, (2) relationships between teachers' scores in specific dimensions of instruction and their contributions to student achievement growth (value-added), and (3) whether teachers' observation ratings depend on the types of students they are assigned to teach. Researchers analyzed the content of the Classroom Assessment Scoring System (CLASS), Framework for Teaching (FFT), Protocol for Language Arts Teaching Observations (PLATO), Mathematical Quality of Instruction (MQI), and UTeach Observational Protocol (UTOP). The content analysis then informed correlation analyses using data from the Gates Foundation's Measures of Effective Teaching (MET) project. Participants were 5,409 4th-9th grade math and English language arts (ELA) teachers from six school districts. Observation ratings were correlated with teachers' value-added scores and with three classroom composition measures: the proportions of nonwhite students, low-income students, and low-achieving students in the classroom. Results show that eight of ten dimensions of instruction are captured in all five instruments, but instruments differ in the number and types of elements they assess within each dimension. Observation ratings in all dimensions with quantitative data were significantly but modestly correlated with teachers' value-added scores, with classroom management showing the strongest and most consistent correlations. Finally, among teachers who were randomly assigned to groups of students, observation ratings for some instruments were associated with the proportion of nonwhite and lower-achieving students in the classroom, more often in ELA classes than in math classes.
Findings reflect conceptual consistency across the five instruments, but also differences in the coverage and the specific practices they assess within a given dimension. They also suggest that observation scores for classroom management more strongly and consistently predict teacher contributions to student achievement growth than scores in other dimensions. Finally, the results indicate that the types of students assigned to a teacher can affect observation ratings, particularly in ELA classrooms. When selecting among instruments, states and districts should consider which provide the best coverage of priority dimensions, how much weight to attach to various observation scores in their evaluation of teacher effectiveness, and how they might target resources toward particular classrooms to reduce the likelihood of bias in ratings.
REL 2016180 Predicting math outcomes from a reading screening assessment in grades 3–8
District and state education leaders and teachers frequently use assessments to identify students who are at risk of performing poorly on end-of-year reading achievement tests. This study explores the use of a universal screening assessment of reading skills for the identification of students who are at risk for low achievement in mathematics and provides support for the interpretation of screening scores to inform instruction. The study results demonstrate that a reading screening assessment predicted poor performance on a mathematics outcome (the Stanford Achievement Test) with accuracy similar to that of screening assessments that specifically measure mathematics skills. These findings indicate that a school district could use an assessment of reading skills to screen for risk in both reading and mathematics, potentially reducing costs and testing time. In addition, this document provides a decision tree framework to support implementation of screening practices and interpretation by teachers.
REL 2016126 Stated Briefly: Who will succeed and who will struggle? Predicting early college success with Indiana’s Student Information System
This "Stated Briefly" report is a companion piece that summarizes the results of another report of the same name. This study examined whether data on Indiana high school students, their high schools, and the Indiana public colleges and universities in which they enroll predict their academic success during the first two years in college. The researchers obtained student-level, school-level, and university-related data from Indiana's state longitudinal data system on the 68,802 students who graduated from high school in 2010. For the 32,564 graduates who first entered a public 2-year or 4-year college, the researchers examined their success during the first two years of college using four indicators of success: (1) enrollment in only nonremedial courses, (2) completion of all attempted credits, (3) persistence to the second year of college, and (4) an aggregation of the other three indicators. Hierarchical linear modeling (HLM) was used to predict students' performance on these indicators using students' high school data, information about their high schools, and information about the colleges they first attended. Half of Indiana 2010 high school graduates who enrolled in a public Indiana college were successful by all indicators of success. College success differed by student demographic and academic characteristics, by the type of college a student first entered, and by the indicator of college success used. Academic preparation in high school predicted all indicators of college success, and student absences in high school predicted two individual indicators of college success and a composite of college success indicators. While statistical relationships were found, the predictors collectively explained less than 35 percent of the variance. The predictors from this study can be used to identify students who will likely struggle in college, but there will likely be false-positive and false-negative identifications.
Additional research is needed to identify other predictors (possibly non-cognitive predictors) that can improve the accuracy of the identification models.
REL 2013008 Evaluating the screening accuracy of the Florida Assessments for Instruction in Reading (FAIR)
This report analyzed student performance on the FAIR reading comprehension screen across grades 4-10 and the Florida Comprehensive Assessment Test (FCAT) 2.0 to determine how well the FAIR and the 2011 FCAT 2.0 scores predicted 2012 FCAT 2.0 performance. The first key finding was that the reading comprehension screen of the Florida Assessments for Instruction in Reading (FAIR) was more accurate than the 2011 Florida Comprehensive Assessment Test (FCAT) 2.0 scores in correctly identifying students as not at risk for failing to meet grade-level standards on the 2012 FCAT 2.0. The second key finding was that using both the FAIR screen and the 2011 FCAT 2.0 lowered the underidentification rate of at-risk students by 12–20 percentage points compared with the results using the 2011 FCAT 2.0 score alone.