Panel Paper:
Asymdystopia: The Threat of Small Biases in Evaluations of Education Interventions That Need to Be Powered to Detect Small Impacts
This paper examines the potential for small biases to increase the risk of false inferences as studies are powered to detect smaller impacts, a phenomenon we refer to as asymdystopia. We examine this threat for two of the most rigorous designs commonly used in education evaluation research.
First, we consider the role of attrition bias in randomized controlled trials (RCTs) as studies are powered to detect smaller impacts. To do so, we use an attrition model for RCTs that is used in several federal evidence reviews, including the What Works Clearinghouse (WWC 2013; 2014). Using this model and WWC data on attrition from more than 800 prior studies, we show that levels of attrition which might once have been acceptable can lead to higher rates of false inference as studies are powered to detect smaller effects. We also use the WWC data to assess the feasibility of achieving lower attrition rates in studies powered to detect small impacts.
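To illustrate the underlying mechanism, the Python sketch below simulates an RCT in which dropout is related to outcomes. It is a simplified illustration, not the WWC attrition model or our actual analysis; the sample sizes, retention probabilities, and effect sizes are hypothetical. The point it demonstrates is that the absolute bias from such attrition is roughly constant, so it looms larger relative to smaller true impacts.

import numpy as np

# Hypothetical illustration: an RCT in which treatment-group members with
# lower outcomes are slightly more likely to drop out. The absolute bias from
# this attrition is roughly constant, so it grows as a share of the true
# impact as the impact shrinks.
rng = np.random.default_rng(0)
n_per_arm = 5000
n_reps = 2000

def mean_bias(true_effect):
    biases = []
    for _ in range(n_reps):
        y0 = rng.normal(0.0, 1.0, n_per_arm)          # control potential outcomes
        y1 = rng.normal(true_effect, 1.0, n_per_arm)  # treatment potential outcomes
        keep_c = rng.random(n_per_arm) < 0.90                    # 10% control attrition
        keep_t = rng.random(n_per_arm) < 0.85 + 0.10 * (y1 > 0)  # outcome-related attrition
        estimate = y1[keep_t].mean() - y0[keep_c].mean()
        biases.append(estimate - true_effect)
    return np.mean(biases)

for effect in (0.20, 0.10, 0.05):   # standardized true impacts
    bias = mean_bias(effect)
    print(f"true impact {effect:.2f}: bias {bias:+.3f} ({abs(bias) / effect:.0%} of the impact)")

In this stylized setup the same attrition process that is a modest nuisance for a study targeting an impact of 0.20 standard deviations accounts for a much larger share of an impact of 0.05 standard deviations.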
Second, we examine functional form misspecification bias in regression discontinuity designs (RDDs) as studies are powered to detect smaller impacts. To do so, we use Monte Carlo simulations to assess what happens as the RDD sample size increases under varying assumptions about the true functional form. The data-generating processes for these simulations are based on data from several prior large-scale RCTs in education (James-Burdumy et al. 2010; Constantine et al. 2009; Campuzano et al. 2009). Specifically, we examine the effect of a larger sample size on statistical power, functional form misspecification bias, and the accuracy of estimated p-values. Our simulations show that, with conventional estimation, Type I error rates rise as studies are powered to detect smaller impacts, but that the robust estimation approach recommended by Calonico et al. (2014) solves this problem.
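The mechanism can be seen in a simple Monte Carlo exercise like the Python sketch below. It is not drawn from the paper's simulations: the cubic data-generating process, noise level, bandwidth, and sample sizes are hypothetical, the true effect at the cutoff is zero, and the robust bias-corrected inference of Calonico et al. (2014) (available in the rdrobust software) is not reproduced here. The sketch simply shows a misspecified global linear fit rejecting a true null more often as the sample grows, while a local fit near the cutoff is far less affected.

import numpy as np
from scipy import stats

# Hypothetical RDD illustration: the true outcome is a cubic function of the
# running variable with NO treatment effect at the cutoff (x = 0). A global
# linear fit on each side is misspecified; its bias does not shrink with n,
# so the Type I error rate climbs as the sample grows. Restricting the fit to
# a narrow window around the cutoff greatly reduces the bias.
rng = np.random.default_rng(1)
n_reps = 1000

def rejection_rate(n, bandwidth=None):
    rejections = 0
    for _ in range(n_reps):
        x = rng.uniform(-1, 1, n)                            # running variable
        d = (x >= 0).astype(float)                           # treatment indicator
        y = 0.5 * x + 0.4 * x**3 + rng.normal(0, 0.5, n)     # zero true effect
        if bandwidth is not None:                            # local fit near cutoff
            keep = np.abs(x) <= bandwidth
            x, d, y = x[keep], d[keep], y[keep]
        X = np.column_stack([np.ones_like(x), d, x, d * x])  # linear, separate slopes
        beta = np.linalg.lstsq(X, y, rcond=None)[0]
        resid = y - X @ beta
        dof = len(y) - X.shape[1]
        cov = (resid @ resid / dof) * np.linalg.inv(X.T @ X)
        t_stat = beta[1] / np.sqrt(cov[1, 1])                # beta[1] = jump at cutoff
        p_value = 2 * (1 - stats.t.cdf(abs(t_stat), dof))
        rejections += p_value < 0.05
    return rejections / n_reps

for n in (200, 1000, 5000):
    print(f"n = {n:5d}: Type I error, global linear = {rejection_rate(n):.2f}, "
          f"local linear (h = 0.25) = {rejection_rate(n, bandwidth=0.25):.2f}")

Because the misspecification bias stays fixed while standard errors shrink with the sample size, conventional tests reject a true null ever more often; the approach of Calonico et al. (2014) addresses this by explicitly correcting the local polynomial estimate for its bias and adjusting the standard errors to account for that correction.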
Overall, our findings suggest that biases that might once have reasonably been ignored can pose a real threat in evaluations powered to detect small impacts. This paper identifies and quantifies some of these biases and shows that they are important to consider when designing evaluations and when analyzing and interpreting evaluation findings. We also discuss potential strategies for addressing these biases. Our findings should not be interpreted as suggesting that researchers should avoid powering evaluations to detect small impacts. The problem of small biases is real but surmountable—so long as it is not ignored.