Publications & Products

Search Results: (1-3 of 3 records)

 Pub Number  Title  Date
REL 2017191 The content, predictive power, and potential bias in five widely used teacher observation instruments
This study was designed to inform decisions about the selection and use of five widely used teacher observation instruments. The purpose was to explore (1) patterns across instruments in the dimensions of instruction that they measure, (2) relationships between teachers' scores in specific dimensions of instruction and their contributions to student achievement growth (value-added), and (3) whether teachers' observation ratings depend on the types of students they are assigned to teach. Researchers analyzed the content of the Classroom Assessment Scoring System (CLASS), Framework for Teaching (FFT), Protocol for Language Arts Teaching Observations (PLATO), Mathematical Quality of Instruction (MQI), and UTeach Observational Protocol (UTOP). The content analysis then informed correlation analyses using data from the Gates Foundation's Measures of Effective Teaching (MET) project. Participants were 5,409 math and English language arts (ELA) teachers in grades 4-9 from six school districts. Observation ratings were correlated with teachers' value-added scores and with three composition measures: proportions of nonwhite students, low-income students, and low-achieving students in the classroom. Results show that eight of ten dimensions of instruction are captured in all five instruments, but instruments differ in the number and types of elements they assess within each dimension. Observation ratings in all dimensions with quantitative data were significantly but modestly correlated with teachers' value-added scores, with classroom management showing the strongest and most consistent correlations. Finally, among teachers who were randomly assigned to groups of students, observation ratings for some instruments were associated with the proportion of nonwhite and lower-achieving students in the classroom, more often in ELA classes than in math classes.
Findings reflect conceptual consistency across the five instruments, but also differences in the coverage and the specific practices they assess within a given dimension. They also suggest that observation scores for classroom management more strongly and consistently predict teacher contributions to student achievement growth than scores in other dimensions. Finally, the results indicate that the types of students assigned to a teacher can affect observation ratings, particularly in ELA classrooms. When selecting among instruments, states and districts should consider which provide the best coverage of priority dimensions, how much weight to attach to various observation scores in their evaluation of teacher effectiveness, and how they might target resources toward particular classrooms to reduce the likelihood of bias in ratings.
11/1/2016
NCEE 20124015 Whether and How to Use State Tests to Measure Student Achievement in a Multi-State Randomized Experiment: An Empirical Assessment Based on Four Recent Evaluations
An important question for educational evaluators is how best to measure academic achievement, the outcome of primary interest in many studies. In large-scale evaluations, student achievement has typically been measured by administering a common standardized test to all students in the study (a “study-administered test”). In the era of No Child Left Behind (NCLB), however, state assessments have become an increasingly viable source of information on student achievement. Using state test scores can yield substantial cost savings for the study and can eliminate the burden of additional testing on students and teaching staff. On the other hand, state tests can also pose certain difficulties: their content may not be well aligned with the outcomes targeted by the intervention, and variation in the content and scale of the tests can complicate pooling scores across states and grades.

This NCEE Reference Report, Whether and How to Use State Tests to Measure Student Achievement in a Multi-State Randomized Experiment: An Empirical Assessment Based on Four Recent Evaluations, examines the sensitivity of impact findings to (1) the type of assessment used to measure achievement (state tests or a study-administered test); and (2) analytical decisions about how to pool state test data across states and grades. These questions are examined using data from four recent IES-funded experimental design studies that measured student achievement using both state tests and a study-administered test. Each study spans multiple states and two of the studies span several grade levels.
10/12/2011
NCEE 20094065 Do Typical RCTs of Education Interventions Have Sufficient Statistical Power for Linking Impacts on Teacher Practice and Student Achievement Outcomes?
For RCTs of education interventions, it is often of interest to estimate associations between student and mediating teacher practice outcomes, to examine the extent to which the study's conceptual model is supported by the data, and to identify specific mediators that are most associated with student learning. This paper develops statistical power formulas for such exploratory analyses under clustered school-based RCTs using ordinary least squares (OLS) and instrumental variable (IV) estimators, and uses these formulas to conduct a simulated power analysis. The power analysis finds that for currently available mediators, the OLS approach will yield precise estimates of associations between teacher practice measures and student test score gains only if the sample contains about 150 to 200 study schools. The IV approach, which can adjust for potential omitted variable and simultaneity biases, has very little statistical power for mediator analyses. For typical RCT evaluations, these results may have design implications for the scope of the data collection effort for obtaining costly teacher practice mediators.
10/13/2009