Using simulated data based on the estimated inter-correlations and reliabilities of the measures in the Gates Foundation’s Measures of Effective Teaching project, this analysis compares correct classification rates and expected differences in long-term teacher value-added among teachers identified as high- or low-performing under these three commonly used approaches. We also investigate how changes in component weights and the use of reliability-adjusted performance measures affect the identification of high and low performers. Based on the simulation results presented here, we conclude that the choice among these approaches matters and can undermine the evaluation system’s objectives in some contexts. Specifically, the numeric approach is preferred among the three common approaches considered, and in several circumstances it is not statistically distinguishable from the best-case, error-minimizing approach that cannot be implemented in practice. In some circumstances, namely when component weights are misaligned with the optimal weighting structure or when reliability-adjusted measures are used, one or both of the two remaining approaches perform significantly worse than the numeric approach and can render the evaluation system useless.
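As a rough illustration of the kind of simulation exercise described above, the sketch below generates noisy performance measures from a latent teacher-quality score, combines them into a weighted composite, and reports a correct classification rate along with the gap in latent quality between flagged and unflagged teachers. It is not the paper's procedure. The function name, the reliabilities, the weights, the 10% cutoff, and the treatment of reliability as the squared correlation with a single latent trait are all assumptions made for illustration, and none of the numbers come from the Measures of Effective Teaching data.

```python
import numpy as np


def simulate_classification(n_teachers=10_000,
                            reliabilities=(0.4, 0.6, 0.5),  # hypothetical values
                            weights=(1 / 3, 1 / 3, 1 / 3),  # hypothetical weights
                            cutoff=0.10,                    # hypothetical cutoff
                            seed=0):
    """Flag the bottom `cutoff` share of teachers on a weighted composite and
    compare the flags with the bottom `cutoff` share on latent quality."""
    rng = np.random.default_rng(seed)
    rel = np.asarray(reliabilities, dtype=float)
    w = np.asarray(weights, dtype=float)

    # Latent (long-run) teacher quality, standard normal by assumption.
    true_quality = rng.standard_normal(n_teachers)

    # Each observed measure has unit variance, and its squared correlation
    # with latent quality equals the assumed reliability.
    noise = rng.standard_normal((n_teachers, rel.size))
    measures = np.sqrt(rel) * true_quality[:, None] + np.sqrt(1 - rel) * noise

    # Weighted composite score across the measures.
    composite = measures @ w

    flagged = composite <= np.quantile(composite, cutoff)
    truly_low = true_quality <= np.quantile(true_quality, cutoff)

    # Share of teachers whose flag matches their latent status.
    correct_rate = float(np.mean(flagged == truly_low))
    # Mean latent quality of unflagged teachers minus that of flagged teachers.
    quality_gap = float(true_quality[~flagged].mean() - true_quality[flagged].mean())
    return correct_rate, quality_gap


if __name__ == "__main__":
    rate, gap = simulate_classification()
    print(f"correct classification rate: {rate:.3f}")
    print(f"latent quality gap (unflagged - flagged): {gap:.3f}")
```

Varying `weights` or `reliabilities` in this sketch gives a sense of how misaligned weights or noisier measures can degrade classification, which is the kind of comparison the abstract reports.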