Regional Educational Laboratory Program

Measuring school leaders’ effectiveness: A multiyear pilot of Pennsylvania’s Framework for Leadership

This series of reports examines the accuracy of performance ratings from the Framework for Leadership (FFL), Pennsylvania’s tool for evaluating the leadership practices of principals and assistant principals. Four key properties of the FFL were analyzed: score variation, internal consistency, year-to-year stability, and concurrent validity. Score variation was characterized by the percentages of school leaders earning scores in different portions of the rating scale. To measure the internal consistency of the FFL, Cronbach’s alpha was calculated for the full FFL and for each of its four categories of leadership practices. Analyses of score stability used the FFL scores of school leaders rated in both years to calculate Pearson’s correlation coefficient. Concurrent validity was assessed through a regression model for the relationship between school leaders’ estimated contributions to student achievement growth and their FFL scores. The first report examined data from the 2012/13 pilot year; the second report is based primarily on the 2013/14 pilot, in which 517 principals and 123 assistant principals were rated by their supervisors. As a whole, the results indicate that the FFL is a reliable measure, with good internal consistency and a moderate level of year-to-year stability in scores. There is also evidence of the FFL’s concurrent validity: principals with higher scores on the FFL, on average, make larger estimated contributions to student achievement growth. Higher total FFL scores and scores in two of the four FFL domains are significantly or marginally significantly associated with both value-added in all subjects combined and value-added in math specifically. This evidence of the validity of the FFL sets it apart from other principal evaluation tools: No other measures of principals’ professional practice have been shown to be related to principals’ effects on student achievement.
However, in both pilot years, variation in scores was limited, with most school leaders scoring in the upper third of the rating scale. As the FFL is implemented statewide, continued examination of evidence on its statistical properties, especially the variation in scores, is important.
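For readers unfamiliar with the two reliability statistics named in this summary, the following is a minimal Python sketch of how Cronbach’s alpha and Pearson’s correlation coefficient are conventionally computed. It is not the reports’ actual analysis code: the function names and the small ratings table are hypothetical and chosen only for illustration.

```python
from statistics import mean, pvariance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a table of ratings.

    item_scores: one row per rated leader, one column per rubric item.
    """
    k = len(item_scores[0])                      # number of items
    columns = list(zip(*item_scores))            # ratings grouped by item
    item_variances = [pvariance(col) for col in columns]
    total_variance = pvariance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


def pearson_r(x, y):
    """Pearson's correlation between two equal-length score lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


# Hypothetical data: four leaders rated on three items (not real FFL scores).
ratings = [
    [2, 3, 2],
    [3, 3, 3],
    [1, 2, 2],
    [2, 2, 3],
]
# Hypothetical total scores for the same four leaders in two successive years.
year_one = [7, 9, 5, 7]
year_two = [8, 9, 6, 6]

print("alpha:", round(cronbach_alpha(ratings), 3))
print("year-to-year r:", round(pearson_r(year_one, year_two), 3))
```

Alpha values closer to 1 indicate that the items move together across leaders, and a year-to-year correlation closer to 1 indicates that leaders keep similar relative positions from one year to the next.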
Publication Type: Making Connections
Publication Date: December 2014