Panel Paper: Evaluation of the i3 Scale-up of Reading Recovery

Thursday, November 12, 2015 : 11:15 AM
Henry May, University of Delaware; Phil Sirinides, University of Pennsylvania; and Abby Gray, University of Pennsylvania, Consortium for Policy Research in Education
While prior research on Reading Recovery shows that the program's impacts on student achievement are often large, research also suggests that there is substantial variability in impacts, and that much of this variability can be attributed to variation in program implementation (D'Agostino & Murphy, 2004; May, Gray, Gillespie, Sirinides, Sam, Goldsworthy, Armijo, & Tognatta, 2013; Schwartz, 2005; Rodgers et al., 2004, 2005). The evaluation of the i3 scale-up of Reading Recovery uses a rigorous experimental design (i.e., a multi-site randomized controlled trial) that supports strong causal inferences about program impacts, coupled with mixed-methods descriptions of program implementation and contextual factors. Short-term impacts on students' reading performance are estimated by comparing the mid-year reading achievement of students randomly assigned to participate in Reading Recovery at the beginning of first grade with that of students randomly assigned to the control condition. To estimate the sustained effects of Reading Recovery, this final year of the evaluation employs a Regression Discontinuity Design (RDD) to compare the 3rd-grade reading achievement of students who were just above and just below the cutoff for eligibility to participate in Reading Recovery.
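
As a rough illustration of the regression discontinuity logic described above, the following minimal sketch fits a local linear regression on simulated data and reads the treatment effect off the jump at the cutoff. It is not the evaluation's actual model: the function name `rdd_estimate`, the bandwidth, the simulated scores, and the assumption that students scoring below the cutoff are the eligible (treated) group are all hypothetical choices made for this example.

```python
import numpy as np

def rdd_estimate(score, outcome, cutoff, bandwidth):
    """Sharp RDD sketch: local linear fit with separate slopes on each side of the cutoff.

    Returns the estimated jump in the outcome at the cutoff for the treated group.
    """
    score = np.asarray(score, dtype=float)
    outcome = np.asarray(outcome, dtype=float)
    centered = score - cutoff
    keep = np.abs(centered) <= bandwidth          # use only observations near the cutoff
    x = centered[keep]
    y = outcome[keep]
    treated = (x < 0).astype(float)               # assumption: scores below the cutoff are eligible
    # Columns: intercept, treatment indicator, running variable, treatment-by-running interaction
    design = np.column_stack([np.ones_like(x), treated, x, treated * x])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[1]                                # coefficient on the treatment indicator

# Simulated (hypothetical) data with a known discontinuity of 0.5 at the cutoff
rng = np.random.default_rng(0)
n = 20000
score = rng.uniform(-10, 10, n)
true_effect = 0.5
outcome = 0.05 * score + true_effect * (score < 0) + rng.normal(0, 1, n)

estimate = rdd_estimate(score, outcome, cutoff=0.0, bandwidth=5.0)
print(f"estimated discontinuity: {estimate:.2f}")
```

Allowing separate slopes on each side of the cutoff keeps a trend in the running variable from being mistaken for a treatment effect; in practice, bandwidth selection and standard errors would need more careful treatment than this sketch provides.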

Results from the first three years of the randomized controlled trial (RCT) have revealed large positive effects on reading achievement (May, Gray, Gillespie, Sirinides, Sam, Goldsworthy, Armijo, & Tognatta, 2013; May, Sirinides, Gray, Armijo, Gillespie, Goldsworthy, Sam, & Blalock, 2014). Over this five-year study, more than 7,000 students in over 1,000 schools were randomly assigned to treatment and control conditions. Baseline equivalence was confirmed for gender, race, English Language Learner status, and prior text reading level. HLM analyses of data from the RCT revealed significant overall effects of Reading Recovery of between .40 and .60 standard deviations on ITBS Reading Test scores. Results from preliminary analyses of the RDD data show that this non-experimental design was able to replicate the results of the RCT for impacts during first grade. The variance of treatment effects across schools was also statistically significant, translating to 90% plausible value intervals that extend more than one-half of a standard deviation on either side of the average effect. This suggests that the short-term impacts of Reading Recovery are positive and relatively large on average, although some schools exhibit unusually small or large impact estimates. Contextual data from the implementation evaluation help to explain this variation in effects. Additional analyses of long-term effects are currently underway and will also be included in the presentation at APPAM.
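
To make the plausible value interval concrete: if school-level effects are assumed to be normally distributed around the average effect with standard deviation tau, a 90% plausible value interval is the average effect plus or minus roughly 1.645 times tau. The sketch below computes such an interval. The inputs (an average effect of 0.50 and a tau of 0.31) are hypothetical values chosen only to be consistent with the ranges reported above; they are not estimates from the study, and the function name is an assumption of this example.

```python
from statistics import NormalDist

def plausible_value_interval(mean_effect, tau, coverage=0.90):
    """Range expected to contain `coverage` of school-level effects,
    assuming those effects are normal with mean `mean_effect` and SD `tau`."""
    z = NormalDist().inv_cdf(0.5 + coverage / 2)   # about 1.645 for 90% coverage
    return mean_effect - z * tau, mean_effect + z * tau

# Hypothetical inputs, not study estimates
lower, upper = plausible_value_interval(mean_effect=0.50, tau=0.31)
half_width = (upper - lower) / 2
print(f"90% plausible value interval: [{lower:.2f}, {upper:.2f}] (half-width {half_width:.2f})")
```

Under these illustrative inputs the half-width slightly exceeds one-half of a standard deviation, which is the scale of school-to-school variation the abstract describes.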

The consistently large positive impacts of Reading Recovery under the i3 scale-up suggest that this relatively large investment has led to substantial improvements in the reading performance of many thousands of students across the nation. This result also serves as a point of validation for the Investing in Innovation (i3) program model: the size of an investment in an educational intervention should be proportionate to its prior evidence of effects.