Panel Paper: The State of the Art in the Research on Value-Added Models of Teacher Performance

Saturday, November 10, 2012 : 3:30 PM
Salon A (Radisson Plaza Lord Baltimore Hotel)


Cassandra Guarino (Indiana University), Mark Reckase (Michigan State University) and Jeffrey Wooldridge (Michigan State University)


This paper summarizes and explains what is currently known and unknown about the ability of value-added models to quantify a teacher's contribution to student learning. It presents a synthesis of our findings from a large project on methodological problems and solutions in constructing value-added measures of teacher performance from standardized test scores. In this paper, we outline sources of bias and imprecision in estimating causal teacher effects and survey the research community's responses to these challenges.
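To fix ideas, a common point of departure in this literature is a dynamic (lagged-score) specification in which teacher effects enter as shifts in the conditional mean of current achievement. The display below is a stylized version for illustration, not the exact model estimated in the paper:

```latex
% Stylized lagged-score value-added specification (illustrative):
% A_{it} is student i's test score in year t, X_{it} are observed
% student characteristics, T_{ijt} indicates assignment to teacher j,
% and \tau_j is the teacher effect of interest.
\[
A_{it} = \lambda A_{i,t-1} + X_{it}\beta + \sum_{j} \tau_j T_{ijt} + \varepsilon_{it}
\]
```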

First, we systematically clarify the set of choices faced by those wishing to construct value-added performance measures and the consequences of those choices. For example, we discuss the consequences of using panel versus cross-sectional estimates, whether or not to adjust for student demographics, whether to include peer effects, and whether and how to standardize test scores. We evaluate commonly used techniques for estimating teacher effects, such as Empirical Bayes shrinkage, variance correction procedures, HLM and Bayesian approaches, instrumental variables, and models based on simple categorizations, such as the “Colorado Growth Model.” We assess the advantages and disadvantages of each approach and identify common elements across approaches.
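As one illustration, Empirical Bayes shrinkage pulls noisy raw teacher estimates toward the grand mean in proportion to their estimated reliability. The sketch below is a minimal version assuming raw effect estimates and their sampling variances are already in hand; the function name and the method-of-moments variance estimate are illustrative choices, not the paper's procedure:

```python
import numpy as np

def eb_shrink(raw_effects, sampling_vars):
    """Empirical Bayes shrinkage of noisy teacher-effect estimates.

    raw_effects:   array of raw (e.g., OLS) teacher effect estimates
    sampling_vars: array of estimated sampling variances, one per teacher
    Returns shrunken estimates pulled toward the grand mean in
    proportion to each estimate's reliability.
    """
    raw_effects = np.asarray(raw_effects, dtype=float)
    sampling_vars = np.asarray(sampling_vars, dtype=float)
    grand_mean = raw_effects.mean()
    # Method-of-moments estimate of the variance of true teacher effects:
    # total variance of raw estimates minus the average sampling variance.
    signal_var = max(raw_effects.var() - sampling_vars.mean(), 0.0)
    # Reliability (shrinkage) weight for each teacher.
    reliability = signal_var / (signal_var + sampling_vars)
    return grand_mean + reliability * (raw_effects - grand_mean)

# Example: a teacher estimated from few students (large sampling variance)
# is shrunk more heavily toward the grand mean.
print(eb_shrink([0.5, -0.3, 0.1], [0.04, 0.20, 0.01]))
```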

Second, we provide an overview of the sensitivity of value-added performance estimates to assessment-related issues, including non-classical IRT-based measurement error, multidimensionality, and floor and ceiling effects. Test characteristics tied to item discrimination, difficulty, and guessing parameters can influence value-added measures. In addition, we discuss the variation in performance measures that one can observe across different test instruments.
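For reference, the discrimination, difficulty, and guessing parameters mentioned above correspond to the standard three-parameter logistic (3PL) item response model, stated here only to fix notation:

```latex
% Three-parameter logistic (3PL) item response function:
% probability that a student with ability \theta answers item k correctly,
% with discrimination a_k, difficulty b_k, and guessing parameter c_k.
\[
P(\text{correct} \mid \theta) = c_k + (1 - c_k)\,
  \frac{1}{1 + e^{-a_k(\theta - b_k)}}
\]
```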

Third, we discuss techniques that have been developed to diagnose threats to the validity of value-added measures. We evaluate the usefulness of statistical tests, such as the Rothstein “falsification test” and others developed in the econometric literature, and point out their limitations. We also discuss and develop other diagnostic techniques to detect nonrandom teacher assignment, and we examine the consequences of applying different types of value-added models in different assignment contexts.
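To illustrate the logic of a Rothstein-style falsification check: future teacher assignments should not predict current achievement gains, so joint explanatory power for next year's teacher indicators signals nonrandom sorting. The sketch below is a minimal version using statsmodels; the column names are illustrative, and it omits the controls and clustered standard errors a real application would require:

```python
import pandas as pd
import statsmodels.formula.api as smf

def rothstein_falsification(df):
    """Minimal Rothstein-style falsification check.

    df columns (illustrative names): 'gain' = current-year score gain,
    'future_teacher' = identifier of the teacher the student will have
    NEXT year. Under random assignment, future teachers should have no
    explanatory power for current gains.
    """
    # Regress current gains on dummies for next year's teacher.
    model = smf.ols("gain ~ C(future_teacher)", data=df).fit()
    # Joint F-test that all future-teacher coefficients are zero;
    # a small p-value is evidence of nonrandom sorting.
    return model.f_pvalue

# Toy example: three students per future teacher.
toy = pd.DataFrame({
    "gain": [0.2, 0.1, 0.3, -0.1, 0.0, -0.2],
    "future_teacher": ["A", "A", "A", "B", "B", "B"],
})
print(rothstein_falsification(toy))
```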

Fourth, we discuss remaining unknowns in value-added research, that is, problems that researchers have not yet addressed or discussed in depth. For example, we currently know very little about the degree to which students and parents respond to particular teacher assignments, and few models currently in use address these issues.

In the final section of the paper, we provide practical advice for researchers and policy makers in a series of recommendations for computing and using value-added performance measures going forward.