Towards a Replication Framework
(Methods and Tools of Analysis)
Recent efforts to promote evidence-based practices in the social sciences (e.g., What Works Clearinghouse for education) assume that scientific findings are of sufficient validity to warrant their use in decision making, often with regard to public policy. Replication and reproducibility have long been cornerstones for establishing trustworthy scientific results. At their core is the belief that scientific knowledge should not be based on chance occurrences. Rather, scientific knowledge should be established through systematic, transparent, and reproducible methods, results that are independently verified and replicated, and findings that are generalizable to a target population of interest.
Given the central role of replication and reproducibility in the accumulation of scientific knowledge, recent methodological work has examined both the prevalence and success of replicating seemingly well-established findings. Thus far, results from these replication efforts have not been promising. The Open Science Collaboration (OSC) conducted replications of 100 experimental and correlational studies published in high-impact psychology journals. Overall, the OSC found that only 36% of the replication studies produced results with the same statistical significance pattern as the original study. These findings prompted the OSC authors to conclude that replicability rates in psychology were low, but not inconsistent with what has been found in other domains of science. Ioannidis argued that most findings published in the biomedical sciences were likely false (2005). His review of more than 1,000 medical publications found that only 44% of replication studies produced results that corresponded with the original findings (2008). Taken together, these results contribute to a growing sense of a “replication crisis” across a range of scientific fields and disciplines.
Considerable disagreement remains about what replication is, its role in science, and whether institutional supports are needed to promote replication efforts. This symposium addresses these issues. The first paper presents a formal definition of the replication design using a potential outcomes framework. It describes five stringent assumptions required for two studies to yield identical results (within the limits of sampling error), and demonstrates the advantages and limitations of different replication design variants for addressing these assumptions. The second paper examines the prevalence of published replication efforts across economics, developmental psychology, public policy, and educational policy journals. In addition to examining how often replication studies are published in high-impact journals, the authors report the percentage of articles in eight high-impact field journals that examine the robustness of treatment effects across datasets, methods, and subgroups, both for articles published recently (2014 forward) and for articles published two decades ago. The third paper provides guidance for promoting a transparent, reproducible workflow in education evaluation and policy research. The framework recommends: 1) pre-registration in the design phase, 2) a data management plan in the analysis phase, 3) a clear reporting protocol in the dissemination phase, and 4) a system for archiving data and analysis code. The discussant for this panel is a statistical methodologist who will comment on the set of papers and also address recent efforts to promote data and methods transparency.