NCEE Blog

National Center for Education Evaluation and Regional Assistance

Why We Need Large-Scale Evaluations in Education

By Elizabeth Warner, Team Leader for Teacher Evaluations, NCEE

These days, people want answers to their questions faster than ever, and education research and evaluation are no exception. In an ideal world, evaluating the impact of a program or policy would be done quickly and at minimal expense. There is increasing interest in quicker-turnaround, low-cost studies, and IES offers grants specifically for these types of evaluations.

But in education research, quicker isn’t always better. It depends, in part, on the nature of the program you want to study and what you want to learn.

Consider the case of complex, multi-faceted education programs. By their very nature, complex programs may require several years for all of the pieces to be fully implemented. Some programs also may take time to influence behavior and desired outcomes. A careful and thorough assessment of these programs may require an evaluation that draws on a large amount of data, often from multiple sources over an extended period of time. Though such studies can require substantial resources, they are important for understanding whether and why an investment had its intended effect. This is especially true for Federal programs that involve millions of dollars of taxpayer money.

On August 24, IES plans to release a new report from a large-scale, multi-year study of a complex Federal initiative: the Teacher Incentive Fund (TIF) [1]. This report, the third from this evaluation, will provide estimated program impacts on student achievement after three years of implementation. The Impact Evaluation of the Teacher Incentive Fund is a $13.7 million study that spans more than six years and has reported interim findings annually since 2014. (The graphic above is from a study snapshot of the first TIF evaluation report.)

A “big study” of TIF was important to do, and here’s why.

Learning from the Teacher Incentive Fund

TIF is a Federal program that provides grants to districts that want to implement performance pay with the goal of improving teacher quality.  TIF grants awarded in 2010 included the requirement that performance pay be based on an educator evaluation system with multiple performance measures consistent with recent research. The grantees were also expected to use the performance measures to guide educator improvement.

The TIF evaluation provides an opportunity not only to learn about impacts on student achievement over time as the grant activities mature but also to get good answers to implementation questions such as:

  • How do districts structure the pay-for-performance bonus component of TIF?
  • Are educators even aware of the TIF pay-for-performance bonuses?
  • Do educators report having opportunities for professional development to learn about the measures and to improve their performance?
  • Do educators change their practice in ways that improve their performance measures? 
  • Are principals able to use pay-for-performance bonuses to hire or retain more effective teachers?

Analyses to address these questions can suggest avenues for program improvement that a small-scale impact evaluation with limited data collection would miss. 

Studying the Initiative Over Time

With four years of data collection, the TIF evaluation is longer than most evaluations conducted by IES. But the characteristics of the TIF program made that extensive data collection important for a number of reasons.

 

First, TIF’s approach to educator compensation differs enough from the traditional pay structure that it might take time for educators to fully comprehend how it works. They likely need to experience the new performance measures to see how they might score and how that score translates into additional money earned. Second, educators might need to receive a performance bonus, or see others receive one, in order to fully believe it is possible to earn a bonus. Seeing bonuses paid out also might help them better understand what behaviors are needed to earn one. Finally, time might be needed for educators to respond to all of the intended policy levers of the program, particularly those related to recruitment and hiring.

(The chart pictured here is from the second report and compares teachers’ understanding with the actual size of the maximum bonuses that they could receive. This chart will be updated in the third report.)

The TIF evaluation is designed to estimate an impact over the full length of the grants as well as provide rich information to improve the program. Sometimes, even after a number of years, some aspects of a program are never fully implemented. Thus, it may take time to see whether it is even possible to implement a complex policy like TIF with fidelity. An evaluation that only looks at initial implementation of a complex program may miss important program components that are incorporated or refined with time. Also, it may take several years to determine whether the intended educator behaviors and desired outcomes of the policy are realized.

Learning how and why a policy does or does not work is central to program improvement. For large, complex programs like TIF, this assessment is only possible with a data-rich study over an extended period of time.

 

[1] Under the newly reauthorized Elementary and Secondary Education Act, this program is now called the Teacher and School Leader Incentive Program, but for this blog, we will call it by its original name.