Evidence on Replication and Robustness

Engel, Mimi

Background

Whether results from a single study replicate across contexts and over time is a key question for the scientific method. Robustness checks, which examine the extent to which findings from a single study are consistent across subgroups and analytic approaches, are an important means of determining whether a quantitative result ‘holds up’ under careful scrutiny or is fragile and therefore more likely to be spurious. Such checks are more common in some academic disciplines than in others. Recent research finds that published studies that emphasize replication are rare in both applied economics and developmental psychology. The researchers analyzed articles published in top field journals in the two disciplines, comparing recent publications with articles published two decades earlier. In addition to finding that articles replicating prior research were extremely uncommon in both disciplines, that study found robustness-checking techniques to be much more commonly used in applied economics than in developmental psychology (Duncan et al., 2014). The purpose of the current study is to replicate and extend that work by examining the extent to which replication and robustness practices are used in public policy and education policy journals.

Research Questions

The current study expands upon this prior research by examining how common replication practices and robustness-checking efforts are in empirical studies published in peer-reviewed public policy and education research journals. We apply the coding scheme developed by Duncan and colleagues (2014) to answer the following research questions:

How often do articles published in five high-impact public policy and/or educational research journals include replication of prior research?
How often do articles published in five high-impact public policy and/or educational research journals include robustness checks such as the use of multiple data sets, subgroup analyses, and multiple estimation techniques?
How do the results from analyses of public policy and education journals compare with results from journals in economics and developmental psychology?

We code articles from five journals: the Journal of Policy Analysis and Management (JPAM); two top field journals in education, the American Educational Research Journal (AERJ) and Educational Evaluation and Policy Analysis (EEPA); and two newer journals in education, AERA Open, which explicitly encourages submission of replication studies, and the Journal of Research on Educational Effectiveness. Including the newer journals allows us to explore whether they are more likely to publish replication studies, particularly when their editorial statements encourage such submissions.

Following the methods used in related research (Duncan et al., 2014), the proposed paper will code approximately 400 articles, half of which were published in recent years (2014-2017) and half of which were published approximately two decades earlier (1994-1997).
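As a rough illustration of how coded articles might be tabulated under this design, the sketch below computes the share of articles flagged for replication or robustness checks within each journal and period. It is a minimal sketch, not the coding scheme of Duncan and colleagues: the `CodedArticle` record, its two yes/no fields, and the `share_by_group` helper are hypothetical names introduced here, and the records are placeholders rather than data from the study.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class CodedArticle:
    """One coded article (hypothetical record layout, for illustration)."""
    journal: str        # e.g. "JPAM"
    period: str         # "1994-1997" or "2014-2017"
    replication: bool   # article replicates prior research
    robustness: bool    # article reports at least one robustness check


def share_by_group(articles, field):
    """Return {(journal, period): share of articles where `field` is True}."""
    counts = defaultdict(lambda: [0, 0])  # [flagged, total] per group
    for article in articles:
        key = (article.journal, article.period)
        counts[key][0] += int(getattr(article, field))
        counts[key][1] += 1
    return {key: flagged / total for key, (flagged, total) in counts.items()}


# Placeholder records for illustration only; not data from the study.
toy_articles = [
    CodedArticle("JPAM", "1994-1997", False, False),
    CodedArticle("JPAM", "1994-1997", False, True),
    CodedArticle("JPAM", "2014-2017", False, True),
    CodedArticle("JPAM", "2014-2017", True, True),
]

replication_shares = share_by_group(toy_articles, "replication")
robustness_shares = share_by_group(toy_articles, "robustness")
```

Grouping by both journal and period keeps the comparison between the 1994-1997 and 2014-2017 samples separate for each journal, which is the contrast the research questions ask about.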

Thus far, we have found that replication of prior research is extremely rare in education, economics, and developmental psychology journals. Additional coding will reveal whether replication is more common in public policy journals and in newer field journals in education. We also find that efforts to check the robustness of results are more common in economics than in education or developmental psychology.

Association for Public Policy Analysis & Management

Panel Paper: Evidence on Replication and Robustness