Poster Paper: The Diagnostic Accuracy of Performance-Based Teacher Retention Policy

Saturday, November 14, 2015
Riverfront South/Central (Hyatt Regency Miami)

J. Edward Guthrie and Gary Henry, Vanderbilt University
Although a growing body of research estimates large benefits in overall student achievement from basing teacher retention on value-added performance estimates, the efficiency of such policies depends on the accurate, binary classification of teachers most likely to be effective if retained. This paper adds to this line of policy research by explicitly focusing on maximizing the accuracy of this categorical identification using multiple sources of information on teacher performance, and by presenting both the positive and negative effects of potential retention policy in terms of diagnostic accuracy. Approaching teacher retention policy from this perspective offers a number of practical and political advantages. First, it provides a framework for incorporating multiple predictors of teacher performance into retention decisions. While research has contrasted the potential effects of retention policies based on various value-added performance thresholds, it has not yet considered how other measures of teacher quality can be incorporated to improve the efficiency with which we identify teachers for retention. Though Hanushek (2009) and Staiger and Rockoff (2010) provide evidence that even personnel decisions based on “imperfect” information would yield large system-wide gains in student achievement, improving the diagnostic accuracy of the test conditions upon which these decisions are made can make policies both more politically palatable and more efficient in producing student gains at a lower cost to teachers and education agencies.

Further, while advocates of both seniority-based and performance-based retention policies rely on strength-of-association measures and generalized “on average” outcomes to defend their positions, skeptics of each side fear the inaccurate identification of the highest- and lowest-performing teachers on a case-by-case basis. Critics of performance-based retention often express their concerns in terms of instability in the quantile rank of individual teachers’ value-added ratings over time, or point to problematic case studies of teachers dismissed where performance-based policies are already in effect. Conversely, advocacy of performance-based retention is often motivated by cases of chronically low-performing but difficult-to-fire tenured teachers, which have led to the infamous “rubber rooms” of New York City and the “dance of the lemons” as districts shield and shuffle their least effective employees. Our diagnostic accuracy approach states policy effects as rates of success and failure of binary treatment decisions, which address these stated positions more directly.

The methodology and language for determining the diagnostic accuracy of tests have been developed primarily in the field of medicine, but have applicability to education policy in cases involving binary treatment decisions and significant associated costs. Considering teacher retention policies from the perspective of diagnostic accuracy allows us to understand effects at the individual-teacher level and explicitly contrast the costs and benefits of various retention rules. Simulating performance-based retention decisions retroactively using administrative performance data and novice-level replacements, this paper first explores the diagnostic accuracy of retention rules proposed in previous literature. We then extend these simulations to consider the extent to which including multiple years of performance data and multiple measures of teacher performance (including principal ratings and student survey feedback) improves the diagnostic accuracy of performance-based teacher retention.
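
As a rough illustration of the diagnostic accuracy framing described above, the sketch below scores a single retention rule against later outcomes using the standard medical-test quantities (sensitivity, specificity, and positive and negative predictive value). It is not the authors' simulation: the threshold rule, the function names, and the eight made-up teacher records are all assumptions chosen for illustration. Here a "positive" test result means the rule would retain the teacher, and the "condition" is that the teacher later proves effective.

```python
import math
from dataclasses import dataclass


def _ratio(numerator, denominator):
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else math.nan


@dataclass(frozen=True)
class DiagnosticAccuracy:
    """Confusion-matrix counts for a binary retention rule (hypothetical helper)."""
    true_pos: int   # retained and later effective
    false_pos: int  # retained but later ineffective
    true_neg: int   # not retained and later ineffective
    false_neg: int  # not retained but later effective

    @property
    def sensitivity(self):
        # Share of effective teachers the rule retains.
        return _ratio(self.true_pos, self.true_pos + self.false_neg)

    @property
    def specificity(self):
        # Share of ineffective teachers the rule does not retain.
        return _ratio(self.true_neg, self.true_neg + self.false_pos)

    @property
    def ppv(self):
        # Share of retained teachers who prove effective.
        return _ratio(self.true_pos, self.true_pos + self.false_pos)

    @property
    def npv(self):
        # Share of non-retained teachers who would have been ineffective.
        return _ratio(self.true_neg, self.true_neg + self.false_neg)


def retain_by_threshold(scores, cutoff):
    """Assumed rule: retain a teacher whose prior score meets the cutoff."""
    return [score >= cutoff for score in scores]


def diagnostic_accuracy(retained, effective):
    """Tally retention decisions against later effectiveness."""
    if len(retained) != len(effective):
        raise ValueError("retained and effective must have the same length")
    tp = fp = tn = fn = 0
    for kept, good in zip(retained, effective):
        if kept and good:
            tp += 1
        elif kept and not good:
            fp += 1
        elif not kept and not good:
            tn += 1
        else:
            fn += 1
    return DiagnosticAccuracy(tp, fp, tn, fn)


# Made-up data: prior-year value-added scores and whether each teacher
# later outperformed a novice-level replacement.
prior_scores = [0.30, 0.12, 0.05, -0.20, -0.02, 0.18, -0.35, 0.09]
later_effective = [True, True, False, False, True, True, False, True]

acc = diagnostic_accuracy(retain_by_threshold(prior_scores, 0.0), later_effective)
print(acc, acc.sensitivity, acc.specificity, acc.ppv, acc.npv)
```

Comparing rules then amounts to swapping in a different `retained` list, for example one built from a multi-year average or a composite of several measures, and contrasting the resulting rates.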