
2006 Research Conference | June 15–16

This conference highlighted the work of invited speakers, independent researchers who have received grant funds from the Institute of Education Sciences, and trainees supported through predoctoral training grants and postdoctoral fellowships. The presentations are those of the authors and do not necessarily represent the views of the U.S. Department of Education or the Institute of Education Sciences.
Hyatt Regency Washington on Capitol Hill
400 New Jersey Avenue, N.W.

Value-Added Models and the Measurement of Teacher Quality: Abstract

Douglas Harris, Tim R. Sass

In the last decade, the availability of administrative databases that track individual student achievement over time has radically altered how education research is conducted and has brought fundamental changes to the ways in which educational programs and personnel are evaluated. Prior to the development of the Texas Schools Project by John Kain in the 1990s, studies of student achievement and the role of teachers in student learning were limited largely to cross-sectional analyses of student achievement levels or simple two-period studies of student achievement gains. Now, in addition to Texas, statewide longitudinal databases exist in North Carolina and Florida, as well as in large urban school districts such as New York, Chicago, Los Angeles and San Diego. The advent of these longitudinal databases has allowed researchers to measure changes in achievement at the individual student level, thereby controlling for the influences of students and families when evaluating educational programs.

The availability of student-level panel data is also fundamentally changing school accountability and the measurement of teacher performance. In Tennessee and Dallas, models of individual student achievement have been used to measure teacher performance. While the stakes are currently low in these cases, there is growing interest among scholars and policymakers alike in using these measures for high-stakes merit pay, school grades, and other forms of accountability. Denver and Houston have recently adopted merit pay systems based on student performance, and Florida plans to implement a statewide system beginning in the 2006-2007 school year. With the federal No Child Left Behind statute requiring high-stakes testing in grades 3-8, the use of student-level longitudinal data to estimate value-added models is likely to expand even more rapidly in the coming years.

Existing studies employ a variety of value-added models, yet few studies explicitly state or test the assumptions underlying their models. We do both using an extensive database of students and teachers from Florida. Unlike most state-level administrative databases, these data include not only test scores and demographic and programmatic information for individual students, but information on student enrollment, attendance and disciplinary actions as well. In addition, Florida's Education Data Warehouse incorporates employment records of all school personnel. Both the student and employee information can be linked to specific classrooms. Using these data, we test many of the central assumptions of existing models and determine the impact of alternative methods on measures of teacher quality.

Our results suggest that student and teacher heterogeneity are the most important issues that value-added models must contend with. We confirm the finding of past studies that covariates are inadequate replacements for individual student and teacher effects. Moreover, random effects models yield inconsistent estimates of model parameters due to correlation between the random effects and explanatory variables in the model. The biases introduced by covariate and random effects models extend both to the estimates of the unobserved teacher quality and the effects of time-varying teacher characteristics (experience and professional development) on student achievement. We also reject the exclusion of individual school effects.

The modeling of students' peers and other non-teacher classroom-level factors appears to have relatively little impact on the estimated effects of teacher quality. The same is true of the modeling of lagged school inputs. We also find that the assumed persistence of educational inputs makes little difference, suggesting that the choice between simple gain-score models and unrestricted value-added models may not be very important. Finally, we find that prior test scores serve as a sufficient statistic for past educational inputs, indicating that standard value-added models can be used instead of the more cumbersome cumulative models of achievement.
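
As a rough illustration of the distinction drawn above, the two model types can be written in a generic form that is common in this literature. The notation is an assumption made here for exposition and is not taken from the abstract: A_{it} is the test score of student i in year t, X_{it} collects time-varying student and classroom covariates, gamma_i, tau_j and phi_m stand for student, teacher and school effects, and lambda is the assumed persistence of prior achievement.

```latex
\begin{align*}
A_{it} &= \lambda A_{i,t-1} + X_{it}\beta + \gamma_i + \tau_j + \phi_m + \varepsilon_{it}
  && \text{(unrestricted value-added model)} \\
A_{it} - A_{i,t-1} &= X_{it}\beta + \gamma_i + \tau_j + \phi_m + \varepsilon_{it}
  && \text{(gain-score model, } \lambda = 1\text{)}
\end{align*}
```

In this sketch the gain-score model is the special case in which lambda is fixed at one, while the unrestricted model leaves lambda to be estimated. In both, the lagged score A_{i,t-1} stands in for the full history of earlier inputs, which is the sense in which prior test scores act as a sufficient statistic.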

These results have significant implications for both educational research and policy. First, the importance of individual fixed effects calls into question the common assumptions made by educational researchers and sophisticated accountability systems such as those in Dallas and Tennessee. But perhaps the most significant problems in using value-added models for accountability are that school effects appear to play an important role and that teachers are non-randomly assigned to schools. The first finding implies that, if school effects are excluded from the models, then the teacher effects are biased and capture factors that appear to be outside the control of the teachers. However, the second fact means that, if the school effects are included in the models, then it is possible only to compare teachers within schools, which may create unproductive competition between teachers. Thus, there appears to be a fundamental trade-off between these two approaches for the purposes of accountability.
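
The trade-off described above can be made concrete with a small sketch. It assumes made-up average gains for four hypothetical teachers in two hypothetical schools and mimics the inclusion of school effects by subtracting each school's mean gain; the function name `within_school_effects` is illustrative only, and this is a simplification rather than the estimation procedure used in the study.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical average student gains per teacher (made-up numbers for illustration).
teacher_gains = {
    ("School A", "T1"): 12.0,
    ("School A", "T2"): 10.0,
    ("School B", "T3"): 6.0,
    ("School B", "T4"): 2.0,
}


def within_school_effects(gains):
    """Subtract each school's mean gain from its teachers' gains.

    This mimics, in a very simplified way, what including school effects
    does: anything common to a school is absorbed, so the remaining
    teacher effects are only meaningful relative to colleagues in the
    same school.
    """
    by_school = defaultdict(list)
    for (school, _teacher), gain in gains.items():
        by_school[school].append(gain)
    school_means = {school: mean(values) for school, values in by_school.items()}
    return {key: gain - school_means[key[0]] for key, gain in gains.items()}


effects = within_school_effects(teacher_gains)
for (school, teacher), effect in sorted(effects.items()):
    print(school, teacher, effect)
```

In this toy data, teacher T3 has a lower raw gain than T1 but a higher within-school effect, because School B's average is lower. Whether that gap between schools reflects school-level factors or the quality of the teachers assigned there cannot be told apart once school means are removed, which is why only within-school comparisons remain.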

The implications of our results for research on teacher quality are somewhat clearer. By testing the assumptions of past models, we have narrowed the range of justifiable models as well as the data requirements that must be met in order to estimate them. Given the coming expansion of standardized testing and improved database capabilities, the importance of understanding value-added modeling will only continue to grow.