Panel Paper:
A Comparison of Teacher Observation Instruments
First, the study uses content analysis of instrument text to assess the major differences and similarities in the dimensions of instruction rated by the five observation instruments. Results indicate that seven of the ten dimensions of instructional practice are common to all five observation instruments, demonstrating that large parts of the instruments are conceptually consistent. However, the instruments differ in how they measure instructional practice within each of the ten dimensions. The FFT may offer a more comprehensive assessment of instruction than the other instruments because, on average, it provided the greatest coverage of elements within a given dimension.
Second, the study uses existing data on 4th-9th grade math and English language arts (ELA) teachers from the Gates Foundation’s Measures of Effective Teaching (MET) project (Kane and Staiger 2012) to examine whether, across different instruments, observation ratings for some dimensions of instruction consistently show stronger correlations with teachers’ value-added scores than others. Among the seven dimensions of instruction with scores available in the MET data, all were significantly, if modestly, associated with teachers’ value-added scores. Classroom management is the dimension that was most consistently and strongly related to teachers’ value-added scores across instruments, subjects, and grades.
Third, the study capitalizes on the random assignment of teachers to groups of students in the second year of the MET study to test the extent to which characteristics of students in the classroom affect instrument ratings, and whether scores for certain instruments, or dimensions of instruction, are more influenced by student characteristics than others. The findings suggest that ratings may be more susceptible to classroom composition for ELA instruction than for math instruction; when using the FFT instrument as opposed to the CLASS, PLATO, or MQI instruments; and when considering the fraction of nonwhite students in the class rather than the share of low-income students or class-average achievement scores. For two of the three instruments used to score ELA instruction (FFT and CLASS), teaching more nonwhite students reduced teachers' observation scores; a similar effect was observed on one instrument (FFT) for teaching lower-achieving students. There was no evidence that classroom composition affects PLATO scores, and there was little evidence of effects in math classes.
The paper discusses implications for state and district selection of observation instruments.