Panel Paper: Cross-Site Impact Variation: How Much Is There?

Thursday, November 3, 2016 : 8:35 AM
Columbia 12 (Washington Hilton)


Natalya Verbitsky-Savitz (1), Michael Weiss (2), Howard Bloom (2), Dan Cullinan (2), Himani Gupta (2) and Alma Vigil (1); (1) Mathematica Policy Research, (2) MDRC


In addition to substantive considerations related to cross-site variation in program effectiveness, program effect variation has methodological implications: how much program effects vary influences statistical power, and therefore study design.  For example, statistical power calculations for multi-site individual random assignment evaluations require evaluators to provide their best guess of how much program effects will vary across sites.  The more program effects vary across sites, the larger the study must be in order to attain a given level of statistical power, all else being equal.
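The dependence of statistical power on cross-site impact variation can be sketched with a standard minimum detectable effect size (MDES) approximation for a multi-site trial with individual random assignment within sites.  This is an illustrative calculation, not the panel's own method; `mdes_multisite` and all default values are hypothetical, outcomes are assumed standardized (within-site SD of 1), and a normal approximation replaces the usual t multiplier.

```python
from math import sqrt
from statistics import NormalDist

def mdes_multisite(J, n, tau, p=0.5, alpha=0.05, power=0.80):
    """Approximate MDES (in SD units) for a multi-site individually
    randomized trial, treating site impacts as random.

    J   : number of sites
    n   : individuals per site
    tau : cross-site SD of site-level effect sizes
    p   : within-site proportion assigned to treatment
    """
    z = NormalDist().inv_cdf
    # Two-sided test multiplier (normal approximation).
    multiplier = z(1 - alpha / 2) + z(power)
    # Variance of the estimated mean impact: a cross-site component
    # (tau**2 / J) plus a within-site sampling component.
    var_mean_impact = tau**2 / J + 1.0 / (J * n * p * (1 - p))
    return multiplier * sqrt(var_mean_impact)
```

Holding the number of sites and individuals fixed, the MDES grows as `tau` grows, which is the sense in which larger cross-site variation requires a larger study.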

Relatedly, how much program effects vary is connected to the generalizability of experimental evaluations, which typically include convenience samples of sites.  If program effects were always homogeneous across sites, generalizability would be a non-issue.  However, if program effects vary substantially across sites, then a particular convenience sample of sites may not yield effect estimates that are broadly applicable to the desired inference populations.

As we begin to seriously consider variation in program effectiveness, it is important to start by getting a handle on just how much program effects tend to vary.  In this project we explore variation in program effects across sites, where sites are typically schools.  That is, we ask: how much do average program effects vary across schools, after-school programs, or pre-school programs?  We examine this question using data from approximately 10 large extant multi-site individually randomized experiments that span pre-school (e.g., the national Head Start Impact Study), elementary school (e.g., Reading Recovery), middle school (e.g., charter middle schools), high school (e.g., Small Schools of Choice), and postsecondary education (e.g., performance-based scholarships).  Outcomes include achievement scores, credit accumulation, and degree completion.

For each study, we will present estimates of the mean of the distribution of site treatment effect sizes (Beta) and the standard deviation of that distribution (Tau) for key outcomes.  Across studies, we find that Tau ranges from nearly zero (little evidence that effects vary across sites) to quite large (Tau estimated to be greater than 0.30).
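One common way to estimate a Beta and Tau of this kind is a random-effects (DerSimonian-Laird style) moment estimator applied to site-level impact estimates and their standard errors.  The sketch below simulates site-level estimates and recovers both quantities; all numbers are invented for illustration and are not from the studies in the panel, which may use different (e.g., likelihood-based multilevel) estimators.

```python
from math import sqrt
from random import gauss, seed

# Simulate J site-level impact estimates around a true cross-site
# distribution with mean true_beta and SD true_tau, each measured
# with sampling standard error se (illustrative values only).
seed(0)
true_beta, true_tau, J, se = 0.15, 0.20, 40, 0.05
est = [gauss(gauss(true_beta, true_tau), se) for _ in range(J)]
ses = [se] * J

# Precision weights and fixed-effect (precision-weighted) mean.
w = [1 / s**2 for s in ses]
beta_hat = sum(wi * yi for wi, yi in zip(w, est)) / sum(w)

# Cochran's Q and the moment estimate of Tau^2, truncated at zero.
Q = sum(wi * (yi - beta_hat) ** 2 for wi, yi in zip(w, est))
c = sum(w) - sum(wi**2 for wi in w) / sum(w)
tau2_hat = max(0.0, (Q - (J - 1)) / c)
tau_hat = sqrt(tau2_hat)
```

When the true Tau is near zero, Q tends toward its degrees of freedom (J - 1) and the truncation at zero kicks in, which matches the "little evidence that effects vary" end of the range reported above.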