Panel Paper: Schools, Classrooms and Evaluators: Examining the Sources of Variation in Teacher Observation Scores in Chicago

Thursday, November 3, 2016 : 10:20 AM
Columbia 4 (Washington Hilton)


Matthew Steinberg, University of Pennsylvania and Jennie Jiang, University of Chicago


Newly implemented evaluation systems hold teachers more accountable than ever for the performance of their students. In these systems, most of a teacher’s evaluation rating, on which retention and tenure decisions are based, depends on subjective classroom observations of the teacher’s instructional practice (Steinberg & Donaldson, in press). Evidence suggests that teachers assigned to lower-achieving students receive lower observation scores (Whitehurst, Chingos, & Lindquist, 2014; Steinberg & Garrett, in press); however, no attention has yet been given to whether observable differences between teachers and their evaluators adversely affect teachers’ observation scores.

In this paper, we address two questions: (1) To what extent are differences in teacher observation scores due to the nonrandom assignment of teachers and evaluators to schools and classrooms? (2) To what extent does the teacher/evaluator match influence teacher observation scores? Prior evidence suggests that the race and gender match between students and their teachers affects student achievement (Dee, 2004; Dee, 2005). We extend this line of research by examining whether assignment to a demographically similar evaluator affects measured teacher performance.

We employ teacher-, evaluator-, and student-level administrative data from Chicago Public Schools (CPS) covering the first two years (2013-14 and 2014-15) of CPS’ newly implemented teacher evaluation system, Recognizing Educators Advancing Chicago’s Students (REACH). For each classroom observation, we observe whether it is formal or informal, as well as the scores a teacher receives on nine components (across two domains, Instruction and Classroom Management) of Danielson’s Framework for Teaching (FFT) classroom observation protocol. We focus on three dimensions along which teachers and their evaluators may differ observably: (i) gender; (ii) race; and (iii) age.

Each year, CPS teachers receive observation scores from observations of multiple classroom lessons. To identify the race-, gender-, and age-specific effects of teacher-evaluator matches on observed measures of teacher performance, we rely on within-year, within-classroom (i.e., within-teacher) variation in teacher-evaluator matches. Notably, our identification strategy relies on the fact that, for some teachers, observations of different classroom lessons are conducted by different evaluators. This feature of the classroom observation process in CPS allows us to employ a teacher fixed effects approach, which accounts for any influence that fixed, unobserved teacher characteristics (including a teacher’s instructional ability) may have on measured performance, as well as any influence that classroom-specific effects may have on observation scores. Further, because our estimates are generated from multiple observations conducted within the same school year, concern about bias due to changes in teacher practice over time is mitigated.
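To illustrate the logic of a teacher fixed effects approach, the following is a minimal sketch on simulated data, not CPS data and not the paper's actual specification. All names (`within_teacher_estimate`, `match`, `ability`, `true_effect`) and all parameter values are hypothetical. The sketch assumes a single binary match indicator and no other covariates; it demeans scores and the match indicator within teacher, so that only teachers observed by both matched and unmatched evaluators contribute to the estimate.

```python
import numpy as np


def within_teacher_estimate(scores, match, teacher_ids):
    """Slope of score on match after demeaning both within teacher.

    Equivalent to OLS with one indicator per teacher when the match
    indicator is the only regressor (an assumption of this sketch).
    """
    scores = np.asarray(scores, dtype=float)
    match = np.asarray(match, dtype=float)
    # Map arbitrary teacher IDs to integer codes 0..n_teachers-1.
    _, idx = np.unique(np.asarray(teacher_ids), return_inverse=True)
    counts = np.bincount(idx)
    # Subtract each teacher's own mean from that teacher's observations.
    score_dm = scores - (np.bincount(idx, weights=scores) / counts)[idx]
    match_dm = match - (np.bincount(idx, weights=match) / counts)[idx]
    denom = np.sum(match_dm ** 2)
    if denom == 0:
        raise ValueError("no within-teacher variation in the match indicator")
    return np.sum(match_dm * score_dm) / denom


# Simulated example (hypothetical values throughout).
rng = np.random.default_rng(0)
n_teachers, n_obs = 500, 4
teacher_ids = np.repeat(np.arange(n_teachers), n_obs)

# Fixed, unobserved teacher ability.
ability = rng.normal(0.0, 1.0, n_teachers)

# Nonrandom assignment: higher-ability teachers are more likely to be
# observed by a demographically matched evaluator, which biases a pooled
# comparison of matched and unmatched observations.
p_match = 1.0 / (1.0 + np.exp(-ability))
match = rng.binomial(1, p_match[teacher_ids])

true_effect = 0.2
noise = rng.normal(0.0, 0.3, teacher_ids.size)
scores = 3.0 + ability[teacher_ids] + true_effect * match + noise

# Pooled slope ignores teacher identity; the within-teacher slope does not.
pooled = np.polyfit(match, scores, 1)[0]
estimate = within_teacher_estimate(scores, match, teacher_ids)
print(f"pooled: {pooled:.3f}  within-teacher: {estimate:.3f}")
```

In this simulation the pooled slope absorbs the correlation between ability and match, while the within-teacher slope compares each teacher only with himself or herself across evaluators. Teachers whose observations are all matched, or all unmatched, have zero demeaned match values and therefore drop out of the estimate.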

This work has important implications for the equitable implementation of newly reformed teacher evaluation systems that rely heavily on subjective evaluations of a teacher’s classroom practice. Indeed, if demographic differences between a teacher and his/her observer affect the teacher’s evaluation ratings, then the ability of these systems to evaluate teacher performance fairly, and to make equitable high-stakes accountability decisions based on classroom observations, may be called into question.