Is the System Solid? Assessing the Validity, Reliability, and Bias of Teacher Evaluation Measures

Upton, Rachel

The implementation of new educator evaluation systems is driving states and school districts to seek innovative ways to evaluate teachers and provide useful, timely feedback to improve classroom instruction. School districts have identified creative ways to conduct frequent teacher evaluations, including using peer observers and teacher leaders to conduct observations and using evaluation results to support teacher development. Despite widespread implementation of new systems, little research evaluates the validity and reliability of peer observer and teacher leader observation ratings. This paper evaluates the outcomes of one school district’s teacher evaluation system, focusing on the evaluation observation rubric and on the ratings provided by teacher observers (n=597).

We examined 2014-15 teacher observation data from Colorado schools. To assess the construct validity and reliability of the teacher evaluation observation rubric, we used exploratory and confirmatory factor analyses and estimated internal consistency with Cronbach’s alpha. Three-level random effects ANOVA models were used to generate empirical Bayes estimates of the intercepts of individual teacher observers. Additionally, multiple linear regression models were used to identify predictors of the teacher effectiveness scores given by observers identified as reporting extremely high or low teacher ratings.
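
For readers unfamiliar with the internal consistency estimate named above, the following is a minimal sketch of how Cronbach’s alpha can be computed from a table of rubric ratings. It is an illustration only, not the paper’s analysis code: the function name `cronbach_alpha`, the three-item rubric, and the `ratings` values are hypothetical and made up for the example.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of rows, one row per observed teacher.

    Each row holds that teacher's ratings on the k rubric items.
    Uses sample variances (n - 1 denominator).
    """
    k = len(scores[0])
    if k < 2 or len(scores) < 2:
        raise ValueError("need at least two items and two observations")

    # Variance of each rubric item across observations.
    item_variances = [variance(column) for column in zip(*scores)]

    # Variance of the summed rubric score across observations.
    total_variance = variance([sum(row) for row in scores])
    if total_variance == 0:
        raise ValueError("total scores have zero variance")

    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical ratings: four observed teachers, three rubric items each.
ratings = [
    [3, 3, 4],
    [2, 2, 2],
    [4, 3, 4],
    [1, 2, 1],
]
alpha = cronbach_alpha(ratings)
print(round(alpha, 3))
```

Values closer to 1 indicate that the rubric items vary together across teachers, which is what an internal consistency check is looking for.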

Association for Public Policy Analysis & Management

Panel Paper: Is the System Solid? Assessing the Validity, Reliability, and Bias of Teacher Evaluation Measures