
IES Grant

Title: Validating an Observation Protocol for the Evaluation of Special Educators
Center: NCSER Year: 2015
Principal Investigator: Jones, Nathan Awardee: Boston University
Program: Educators and School-Based Service Providers      [Program Details]
Award Period: 4 years (7/1/2015-6/30/2019) Award Amount: $1,600,000
Type: Measurement Award Number: R324A150231
Description:

Co-Principal Investigators: Bell, Courtney; Brownell, Mary

Purpose: The purpose of this project was to examine the validity of Charlotte Danielson's Framework for Teaching (FFT) for use with special education teachers. The FFT is commonly used to evaluate both general and special education teacher instruction, yet most research on this tool has been conducted only with general education teachers. Moreover, the extent to which the FFT captures evidence-based instructional approaches for students with disabilities has not been investigated. As such, the goal of this study was to provide insight into the strengths and limitations of using the FFT with special educators and a greater understanding of the characteristics, processes, and outcomes associated with high-quality special education instruction.

Project Activities: The research team collected videotaped observations of all participating special education teachers. Trained raters then scored the videos using the FFT. Videos were also scored using a special education-specific observation tool, the Quality of Classroom Instruction (QCI) instrument. Finally, the research team analyzed the data and examined the validity and reliability of the FFT for use with special educators.

Key Outcomes: The main findings of this study, as reported by the principal investigator, are as follows:

  • From a strict measurement perspective, the FFT functions with special educators in ways that are aligned with previous research on the FFT among general educators. This is true of the accuracy and reliability of rater scores, rater bias, the FFT's factor structure, and the generalizability of FFT scores.
  • On the FFT's Instruction domain, special educators were rated almost universally low. On a 4-point scale, some components had mean scores between Ineffective (1) and Developing (2), and over 95% of teachers scored below the Proficient (3) level. Relying on these scores alone would lead to the conclusion that special educators were performing far below expectations for teacher effectiveness.
  • An observation system more closely aligned with special education instruction revealed far more variation in teachers' instructional quality. This suggests that the FFT does not capture critical practices known to support students with disabilities, a finding echoed in the research team's systematic content analysis of the FFT.
  • Collectively, findings from this study suggest that the FFT, as currently written, has important limitations as a basis for high-stakes teacher evaluation decisions, and that it is likely not a useful tool for supporting special education teacher improvement.
  • An additional obstacle to the FFT's use in schools was principals' knowledge of special education instruction. Cognitive interviews revealed that principals varied widely in how they conceptualized the goals of special education, and these perspectives shaped how they approached the evaluation process.

Structured Abstract

Setting: Data were collected from elementary and middle schools in Rhode Island.

Sample: Participants included 51 special educators who taught students with high-incidence disabilities in grades 3–8.

Assessment: The FFT is a general observation system designed for use across all grade levels and content areas. It specifies four domains of teaching: planning and preparation, classroom environment, instruction, and professional responsibilities. This project focused on validating the instrument's use in the domains of instruction and professional responsibilities. The FFT is based on an inquiry approach to teaching, in which teachers facilitate and students direct their own learning.

Research Design and Methods: Researchers used Michael Kane's validity argument approach to assess evidence supporting the proposed interpretation and use of FFT scores. Specifically, they examined the following inferences: 1) scoring (to what degree do the scores support their intended purpose of assessing teacher quality?), 2) generalization (is the sample of observations collected sufficiently representative of the pool of teachers' lessons?), 3) extrapolation (do FFT scores converge with a broader conception of teaching quality?), and 4) implication (do FFT teacher quality scores support the decisions made based on those scores?). Assumptions behind the first three inferences were tested empirically. Special educators' classrooms were videotaped and scored using the FFT at four different time points throughout the school year. Observations captured reading and math instruction to control for subject-specific effects. Raters scored each videotaped lesson, and their scores were calibrated on a weekly basis by comparing them to a master rater's scores. Scores from the videotaped lessons were used to address the scoring and generalization inferences. The extrapolation inference was assessed by comparing scores on the FFT to a special education-specific observation tool and to student growth scores. The implication inference was investigated by comparing FFT scores generated by study raters to FFT scores assigned by administrators and to overall teacher evaluation scores, which carry consequences for teachers.

Control Condition: Due to the nature of the research design, there was no control condition.

Key Measures: The FFT was used to score videotaped observations of special education teacher instruction. The Quality of Classroom Instruction instrument was also used to score teacher instruction and was meant to capture more teacher-directed instruction. Student achievement was assessed using scores on the New England Common Assessment Program, a state standardized test used for school accountability purposes. Student growth percentiles were determined using the Rhode Island Growth Model. Administrative data were collected on teachers (including type of license held, scores on licensure tests, highest degree, number of years teaching, and demographics) and students (including school information, teachers, special education status, and demographics).

Data Analytic Strategy: Inter-rater reliability was assessed, and individual raters' scoring patterns were examined for potential bias. A confirmatory factor analysis was conducted, with model fit evaluated using the root mean square error of approximation (RMSEA). Generalizability studies were conducted to investigate sources of variation in ratings for each teacher and to examine the extent to which FFT scores were related to divergent factors such as subject matter or service delivery model. Correlations were calculated between FFT scores and Quality of Classroom Instruction (QCI) scores, and between each set of observation scores and student growth percentiles.
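Two of the analyses named above, rater agreement against a master rater and the FFT-QCI correlation, can be sketched in a few lines of Python. This is a minimal illustration only: all scores below are invented for demonstration, and the functions are simplified stand-ins for the study's actual reliability and correlation procedures.

```python
# Hypothetical illustration of two analyses from the Data Analytic Strategy:
# (1) inter-rater agreement between a study rater and a master rater, and
# (2) the correlation between FFT scores and a second instrument (QCI).
# All scores are invented; they are not data from the study.
from statistics import mean
from math import sqrt

def exact_agreement(rater_a, rater_b):
    """Proportion of lessons on which two raters assign identical scores."""
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)

def pearson_r(x, y):
    """Pearson correlation between two lists of scores."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Invented 4-point FFT component scores for eight videotaped lessons
study_rater  = [2, 1, 2, 2, 3, 1, 2, 2]
master_rater = [2, 1, 2, 3, 3, 1, 2, 2]
qci_scores   = [2.5, 1.0, 3.0, 2.0, 3.5, 1.5, 2.0, 2.5]  # hypothetical QCI ratings

print(f"Exact agreement with master rater: {exact_agreement(study_rater, master_rater):.2f}")
print(f"FFT-QCI correlation: {pearson_r(study_rater, qci_scores):.2f}")
```

In practice, calibration and reliability work of this kind would use chance-corrected agreement statistics (e.g., weighted kappa or intraclass correlations) rather than raw exact agreement, and generalizability studies would partition score variance across teachers, raters, lessons, and occasions.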

Products and Publications

ERIC Citations: Find available citations in ERIC for this award here.

Project Website: https://sites.bu.edu/setleaders/

Additional Online Resources and Information:
https://wheelockpolicycenter.org/effective-teachers/sped-teacher-eval/

Book chapter

Jones, N. D. & Gilmour, A. (2018). Special education teacher evaluation: Examining current practices and research. In J. B. Crockett, B. S. Billingsley, & M. L. Boscardin (Eds.), Handbook of Leadership and Administration for Special Education. New York, NY: Routledge.

Journal articles

Johnson, E. S., Reddy, L. A., & Jones, N. D. (2021). Introduction to The Special Series: Can Direct Observation Systems Lead to Improvements in Teacher Practice and Student Outcomes? Journal of Learning Disabilities, 54(1): 3–5.

Morris-Mathews, H., Stark, K. R., Jones, N. D., Brownell, M. T., & Bell, C. A. (2021). Danielson's Framework for Teaching: Convergence and Divergence with Conceptions of Effectiveness in Special Education. Journal of Learning Disabilities, 54(1): 66–78.

