REL 2015030 Redesigning Teacher Evaluation: Lessons from a Pilot Implementation
REL Northeast and Islands, in collaboration with the Northeast Educator Effectiveness Research Alliance and the New Hampshire Department of Education, conducted a study of the implementation of new teacher evaluation systems in New Hampshire’s School Improvement Grant (SIG) schools. While the basic system features are similar across district plans, the specifics of these features vary considerably by district. District fidelity to the plans, as measured by the exposure of teachers to different features of the evaluation system, ranged from moderate to high. Researchers identified several factors related to implementation: capacity of administrators to conduct evaluations; initial and ongoing evaluator training; the introduction and design of student learning objectives; and the professional climate of schools, including the support of the new system by teachers and evaluators.
REL 2015047 The Utility of Teacher and Student Surveys in Principal Evaluations: An Empirical Investigation
This study examined whether adding student and teacher survey measures to existing principal evaluation measures increases the overall power of the principal evaluation model to explain variation in student achievement across schools. The study was conducted using data from 2011-12 on 39 elementary and secondary schools within a midsize urban school district in the Midwest. The research team used the results of the district’s Tripod student and teacher surveys to construct six school-level measures of school conditions that prior research has shown to be associated with effective school leadership. The study finds that adding the full set of six survey measures as a group results in statistically significant increases in variance explained in mathematics and composite value-added outcomes, but not in reading. A stepwise regression procedure identified two measures – instructional leadership and classroom instructional environment – as an optimal subset of the six measures. This evidence indicates that student and teacher survey measures can have utility for principal performance evaluation.
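One common way to test whether a block of added measures raises variance explained is a nested-model F-test on the change in R-squared. The sketch below is illustrative only and is not necessarily the procedure the study used. The R-squared values and the number of baseline measures are hypothetical; only the 39 schools and the six survey measures come from the summary above.

```python
def incremental_f(r2_base, r2_full, n, k_base, k_full):
    """Return (F, df1, df2) for the R-squared gain of a full model over a nested base model.

    r2_base, r2_full: R-squared of the base and full regressions
    n: number of observations (schools)
    k_base, k_full: number of predictors (excluding the intercept) in each model
    """
    if not 0 <= r2_base <= r2_full < 1:
        raise ValueError("need 0 <= r2_base <= r2_full < 1")
    df1 = k_full - k_base      # number of added predictors
    df2 = n - k_full - 1       # residual degrees of freedom in the full model
    if df1 <= 0 or df2 <= 0:
        raise ValueError("full model must add predictors and leave residual df")
    f_stat = ((r2_full - r2_base) / df1) / ((1 - r2_full) / df2)
    return f_stat, df1, df2


# Hypothetical inputs: 2 existing evaluation measures, plus the 6 survey measures.
f_stat, df1, df2 = incremental_f(r2_base=0.30, r2_full=0.50, n=39, k_base=2, k_full=8)
print(f"F({df1}, {df2}) = {f_stat:.2f}")
```

The resulting statistic would be compared against an F distribution with (df1, df2) degrees of freedom to judge whether the gain is statistically significant.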
NCEE 20144017 Understanding Variation in Treatment Effects in Education Impact Evaluations: An Overview of Quantitative Methods
This report summarizes the complex research literature on quantitative methods for assessing how impacts of educational interventions on instructional practices and student learning differ across students, educators, and schools. It also provides technical guidance about the use and interpretation of these methods. The research topics addressed include: subgroup (moderator) analyses based on study participants’ characteristics measured before the intervention is implemented; subgroup analyses based on study participants’ experiences, mediators, and outcomes measured after program implementation; and impact estimation when treatment effects vary. The focus is on randomized controlled trials, but the methods are also applicable to quasi-experimental designs.
REL 2014007 Logic Models: A Tool for Designing and Monitoring Program Evaluations
This quick reference guide defines the major components of education programs—resources, activities, outputs, and short-, mid-, and long-term outcomes—and uses an example to demonstrate the relationships among them.
REL 20124010 An Investigation of the Impact of the 6+1 Trait® Writing Model on Grade 5 Student Writing Achievement
Reading, writing, and arithmetic have long been considered the foundation, or “basics,” of education in the United States. Writing skills are important for an increasing number of jobs. Poor writing skills are a barrier to hiring and promotion for many individuals, and remediation of problems with writing imposes significant operational and training costs on public and private organizations. Writing is also important for the development of reading skills and can improve learning in other academic content areas. In response to the perceived neglect of writing in U.S. education, the National Commission on Writing proposed a set of recommendations for making writing a central element in school reform efforts. These concerns were echoed in regional needs assessment studies conducted by Regional Educational Laboratory (REL) Northwest, in which educators in the region placed a high priority on writing and literacy education.
NCEE 20124015 Whether and How to Use State Tests to Measure Student Achievement in a Multi-State Randomized Experiment: An Empirical Assessment Based on Four Recent Evaluations
An important question for educational evaluators is how best to measure academic achievement, the outcome of primary interest in many studies. In large-scale evaluations, student achievement has typically been measured by administering a common standardized test to all students in the study (a “study-administered test”). In the era of No Child Left Behind (NCLB), however, state assessments have become an increasingly viable source of information on student achievement. Using state test scores can yield substantial cost savings for the study and can eliminate the burden of additional testing on students and teaching staff. On the other hand, state tests can also pose certain difficulties: their content may not be well aligned with the outcomes targeted by the intervention, and variation in the content and scale of the tests can complicate pooling scores across states and grades.

This NCEE Reference Report, Whether and How to Use State Tests to Measure Student Achievement in a Multi-State Randomized Experiment: An Empirical Assessment Based on Four Recent Evaluations, examines the sensitivity of impact findings to (1) the type of assessment used to measure achievement (state tests or a study-administered test); and (2) analytical decisions about how to pool state test data across states and grades. These questions are examined using data from four recent IES-funded experimental design studies that measured student achievement using both state tests and a study-administered test. Each study spans multiple states and two of the studies span several grade levels.
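One widely used way to put state test scores on a common footing before pooling is to convert them to z-scores within each state and grade. The sketch below assumes that approach for illustration; the report examines several pooling decisions and this is not presented as its method. The function name, the record layout, and the scores are hypothetical.

```python
from collections import defaultdict
from statistics import mean, pstdev


def standardize_within_groups(records):
    """Convert raw scores to z-scores within each (state, grade) group.

    records: list of (state, grade, score) tuples
    Returns a list of (state, grade, z_score) tuples in the same order.
    """
    groups = defaultdict(list)
    for state, grade, score in records:
        groups[(state, grade)].append(score)

    stats = {}
    for key, scores in groups.items():
        sd = pstdev(scores)  # population SD of the group's scores
        if sd == 0:
            raise ValueError(f"no score variation in group {key}")
        stats[key] = (mean(scores), sd)

    return [
        (state, grade, (score - stats[(state, grade)][0]) / stats[(state, grade)][1])
        for state, grade, score in records
    ]


# Hypothetical scores from two states whose tests use very different scales.
raw = [
    ("A", 3, 300), ("A", 3, 320), ("A", 3, 340),
    ("B", 3, 40), ("B", 3, 50), ("B", 3, 60),
]
pooled = standardize_within_groups(raw)
```

After standardization, the lowest scorer in each state has the same z-score even though the raw scales differ. This addresses differences in scale only; it does not address differences in test content.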
NCEE 20124016 Estimating the Impacts of Educational Interventions Using State Tests or Study-Administered Tests
State assessments provide a relatively inexpensive and increasingly accessible source of data on student achievement. In the past, rigorous evaluations of educational interventions typically administered standardized tests selected by the researchers ("study-administered tests") to measure student achievement outcomes. Increasingly, researchers are turning to the lower cost option of using state assessments for measures of student achievement.
NCEE 20114002 Achievement Effects of Four Early Elementary Math Curricula: Findings for First and Second Graders
The restricted-use data file for this report contains data for the 2006-07 and 2007-08 school years for 4 core early elementary mathematics curricula implemented in 1st and 2nd grades. Data include teacher surveys, teacher math knowledge, classroom observation, and student mathematics achievement.
NCEE 20114001 Achievement Effects of Four Early Elementary Math Curricula: Findings for First and Second Graders
According to a national evaluation of four math curricula, among first graders, the results favored Math Expressions over both Investigations and SFAW, but not over Saxon. Among second graders, the results favored Math Expressions and Saxon over SFAW, but not over Investigations.

The four curricula studied include: (1) Investigations in Number, Data, and Space, (2) Math Expressions, (3) Saxon Math, and (4) Scott Foresman-Addison Wesley Mathematics (SFAW).

The evaluation compared the relative effects, including differences in teacher training, instructional strategies, content coverage, and materials, of these four curricula on the math achievement of first and second graders in 110 schools in 12 participating districts in 10 states. Schools were randomly assigned within each district to implement one of the four curricula in first and second grade. After one year, this study found significant impacts on student achievement of two curricula relative to the other two curricula in the study.

  • The average math achievement of first graders in schools using Math Expressions was higher than in schools using Investigations and SFAW, but not higher than in schools using Saxon. The difference is equivalent to moving a student from the 50th to the 54th percentile.
  • The average math test score for second graders in schools using Math Expressions and in schools using Saxon was higher than that in schools using SFAW, but not higher than that in schools using Investigations. The differences are equivalent to moving a student from the 50th to the 55th and 57th percentile, respectively. Based on the curriculum requirements, Saxon teachers reported spending an average of one hour more on math instruction per week than teachers using other curricula, largely due to the extensive daily routines included in the Saxon curricula.
  • Almost all teachers reported using their assigned curriculum and, based on classroom observations, the instructional approaches of teachers in the four curriculum groups differed as expected. Math Expressions blended student-centered and teacher-directed approaches to math instruction, while student-centered instruction and peer collaboration were highest in Investigations classrooms, and teacher-directed instruction was highest in Saxon classrooms.
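The percentile statements in the findings above can be related to standardized effect sizes. The sketch below assumes normally distributed scores and a student who starts at the median, which is a conventional simplification and not a detail taken from the report. The function names are hypothetical.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def percentile_after(effect_size):
    """Percentile reached by a median student whose score shifts by effect_size SDs."""
    return 100 * STANDARD_NORMAL.cdf(effect_size)


def effect_size_for_percentile(percentile):
    """Effect size (in SD units) that moves a median student to the given percentile."""
    return STANDARD_NORMAL.inv_cdf(percentile / 100)


# Percentile gains quoted in the findings: 50th to 54th, 55th, and 57th.
for target in (54, 55, 57):
    print(target, round(effect_size_for_percentile(target), 2))
```

Under these assumptions, a move from the 50th to the 54th percentile corresponds to an effect of roughly a tenth of a standard deviation.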
NCEE 2009006 Survey of Outcomes Measurement in Research on Character Education Programs
Character education programs are school-based programs that have as one of their objectives promoting the character development of students. This report systematically examines the outcomes that were measured in evaluations of a delimited set of character education programs and the research tools used for measuring the targeted outcomes. The multi-faceted nature of character development and the many possible ways of conceptualizing it, the large and growing number of school-based programs to promote character development, and the relative newness of efforts to evaluate character education programs using rigorous research methods all combine to make the selection or development of measures relevant to the evaluation of these programs especially challenging. This report is a step toward creating a resource that can inform measure selection for conducting rigorous, cost-effective studies of character education programs. The report, however, does not provide comprehensive information on all measures or types of measures, guidance on specific measures, or recommendations on specific measures.
NCEE 2009013 Technical Methods Report: Using State Tests in Education Experiments: A Discussion of the Issues
Securing data on students' academic achievement is typically one of the most important and costly aspects of conducting education experiments. As state assessment programs have become practically universal and more uniform in terms of grades and subjects tested, the relative appeal of using state tests as a source of study outcome measures has grown. However, the variation in state assessments--in both content and proficiency standards--complicates decisions about whether a particular state test is suitable for research purposes and poses difficulties when planning to combine results across multiple states or grades. This discussion paper aims to help researchers evaluate and make decisions about whether and how to use state test data in education experiments. It outlines the issues that researchers should consider, including how to evaluate the validity and reliability of state tests relative to study purposes; factors influencing the feasibility of collecting state test data; how to analyze state test scores; and whether to combine results based on different tests. It also highlights best practices to help inform ongoing and future experimental studies. Many of the issues discussed are also relevant for non-experimental studies.
