This study, undertaken at the request of the Nevada Department of Education, examined the year-to-year stability of teacher-level growth scores from the Student Growth Percentile (SGP) model, which many states and districts have selected as a measure of effectiveness in their teacher evaluation systems. The authors conducted a generalizability study using three years of mathematics and reading data for nearly 370 elementary and middle school teachers from the Washoe County School District in Reno, the second-largest district in Nevada. The study found that in mathematics, half of the variance in teachers' annual growth scores (median SGPs) was attributable to stable differences among teachers, while the other half was due to random or unstable sources. In reading, a proportion of .41 of the variance in annual scores was attributable to differences among teachers, while .59 was due to random or unstable sources. More stable measures of effectiveness can be constructed by averaging multiple years of growth scores for a teacher, and the report provides stability estimates for averages of two, three, and four years of annual scores. The results can also be used to examine the accuracy of judgments about teachers' effectiveness that are based on these scores. The findings suggest that states examining the properties of their teacher effectiveness estimates, and considering them for teacher accountability, may want to be cautious in using such scores for teacher evaluation.
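To illustrate why averaging over years yields a more stable measure, the sketch below applies the standard Spearman-Brown projection to the single-year proportions reported above (.50 for mathematics, .41 for reading). It assumes that the unstable component is independent across years and has the same variance each year; the function name `reliability_of_average` is hypothetical, and the printed values are illustrative projections, not the stability estimates published in the report.

```python
def reliability_of_average(single_year: float, n_years: int) -> float:
    """Project the stability of an n-year average from single-year stability.

    Uses the Spearman-Brown formula: n * rho / (1 + (n - 1) * rho).
    Assumption: unstable variance is independent and equal across years.
    """
    if not 0.0 <= single_year <= 1.0:
        raise ValueError("single_year must be a proportion between 0 and 1")
    if n_years < 1:
        raise ValueError("n_years must be at least 1")
    return n_years * single_year / (1 + (n_years - 1) * single_year)


# Single-year proportions of variance attributable to teachers (from the abstract).
single_year_stability = {"mathematics": 0.50, "reading": 0.41}

for subject, rho in single_year_stability.items():
    for n in (1, 2, 3, 4):
        projected = reliability_of_average(rho, n)
        print(f"{subject:12s} {n}-year average: {projected:.2f}")
```

Under these assumptions the projected stability rises with each added year but with diminishing returns, which is consistent with the abstract's point that multi-year averages are more stable than a single annual score.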