Making Rigorous Impact Evidence Reliable for Local Policy Decision-Making: Examining Prediction of Site-Specific Impacts
Many evaluations of education interventions and social programs are primarily intended to inform local decision makers in education and social service offices. Decisions to adopt specific curricula or teaching techniques, or to provide services of a particular type to families served by the TANF and child welfare systems, are made by local school administrators and state and local human services officials. When program directors wish to use the results of rigorous multi-site national evaluations of interventions to inform their local decisions, they face a question: “If we were to implement locally the intervention tested in the national evaluation, would it produce impacts of the same direction and magnitude…or some other particular effects (or no effect at all)?” This paper investigates this question, dubbed the “transfer validity” issue for impact findings, using data from an actual evaluation.
Recent work on the external validity of large-scale impact evaluations has focused on widening the applicability of findings from the evaluation sample to the nation as a whole. Virtually no research has been conducted on narrowing evaluation results so that their policy guidance transfers better to specific local program offices or school districts. Using an example from education (charter school reform), our research identifies and formalizes methods for using evidence from large-scale randomized evaluations to predict the effects of implementing an intervention in individual locations. We investigate and report whether particular statistical methods, combined with existing data from the national evaluation of charter schools sponsored by the Institute of Education Sciences, are successful at predicting site-specific impacts.
To do this, we imagine that one of the sites in the charter school study had not been included in the evaluation and try to extrapolate (“transfer”) the impact estimate for that site using the experimental evidence of impacts in all the other sites. We then compare the predicted impact for that site to the gold standard estimate obtained by using the site’s actual data and the within-site randomized design. Repeating this exercise one site at a time gives a large set of “case study” findings on transfer validity, i.e., the transportability of impact evidence to local jurisdictions. The methods examined include reweighting approaches and outcome model-based strategies. Methods that show consistent success in this exercise can serve as a means of predicting impacts in sites outside the study.
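The leave-one-site-out exercise described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure used in the study: the data are synthetic, the single site-level covariate is hypothetical, and the "outcome model" is a one-predictor least-squares line fitted to the impacts of the remaining sites. All function names (`site_impact`, `fit_line`, `leave_one_site_out`, `make_synthetic_sites`) are invented for this sketch.

```python
import random
from statistics import mean


def site_impact(records):
    """Experimental benchmark for one site: treated mean minus control mean."""
    treated = [y for t, y in records if t == 1]
    control = [y for t, y in records if t == 0]
    return mean(treated) - mean(control)


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x with a single predictor."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx
    return y_bar - b * x_bar, b


def leave_one_site_out(sites):
    """sites maps site_id -> (covariate, records).

    Returns site_id -> (predicted impact, within-site benchmark impact).
    Each prediction uses only the impacts estimated in the other sites.
    """
    impacts = {s: site_impact(recs) for s, (_, recs) in sites.items()}
    results = {}
    for held_out in sites:
        others = [s for s in sites if s != held_out]
        a, b = fit_line([sites[s][0] for s in others],
                        [impacts[s] for s in others])
        predicted = a + b * sites[held_out][0]
        results[held_out] = (predicted, impacts[held_out])
    return results


def make_synthetic_sites(n_sites=12, n_per_arm=200, seed=0):
    """Synthetic data only: impacts are assumed to vary linearly with x."""
    rng = random.Random(seed)
    sites = {}
    for s in range(n_sites):
        x = rng.uniform(0, 1)  # hypothetical site-level covariate
        true_impact = 0.1 + 0.4 * x
        records = [(1, rng.gauss(true_impact, 1.0)) for _ in range(n_per_arm)]
        records += [(0, rng.gauss(0.0, 1.0)) for _ in range(n_per_arm)]
        sites[s] = (x, records)
    return sites


sites = make_synthetic_sites()
results = leave_one_site_out(sites)
# Root mean squared gap between transferred and benchmark impacts.
rmse = mean((p - b) ** 2 for p, b in results.values()) ** 0.5
for s, (p, b) in results.items():
    print(f"site {s}: predicted={p:.3f} benchmark={b:.3f}")
print(f"RMSE across held-out sites: {rmse:.3f}")
```

Swapping `fit_line` for a reweighting estimator or a richer outcome model, and comparing the resulting prediction errors across methods, mirrors the comparison the paper describes. Note that the benchmark itself carries sampling error, so some gap between prediction and benchmark is expected even for a well-specified method.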
This research addresses the oft-heard question from policy makers in evaluation sites when viewing the national findings, “Did it work for my setting?” and the frequent lament of policy makers outside the study sample, “How can I know if it would work in my setting?” Moreover, while the same extrapolation strategies may not work best in other contexts such as foster care reform or job training services, each future multi-site national evaluation can follow the analytic path pioneered here to identify the most reliable “impact transfer” method for its circumstances.