Panel Paper:
Not so Conservative after All: Exact Matching and Attenuation Bias in Randomized Experiments
When a unique identifier is available, linking can be trivial. More often, however, linking data from an experimental intervention to administrative records that track outcomes of interest requires matching datasets without a common unique identifier. Demographic characteristics such as name and date of birth are used to match the datasets, and errors in matching are inevitable. When scholars match in order to identify outcomes, there is often no prior about what the match rate should be, which makes match quality difficult to diagnose. For example, if researchers want to evaluate whether an employment program reduces the likelihood of arrest, individuals who match to the arrest file are coded as “arrested” and individuals who do not match are coded as “not arrested.” The problem is not limited to arrest; it applies to any context in which there is no prior for how many records should match (e.g., hospital utilization, college matriculation, program completion). To minimize errors, researchers often use “exact matching” (retaining an individual only if their name and date of birth match exactly across two or more datasets) to ensure that speculative matches do not introduce errors into the dataset used to evaluate the intervention.
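As a minimal sketch of what exact matching looks like in practice (the dataframes, column names, and values below are hypothetical illustrations, not data from the paper), a single typo in the administrative file is enough to turn a true arrest into a coded “not arrested”:

import pandas as pd

# Hypothetical experimental roster and administrative arrest records.
experiment = pd.DataFrame({
    "person_id": [1, 2, 3],
    "first_name": ["Maria", "James", "Dana"],
    "last_name": ["Lopez", "Smith", "Lee"],
    "dob": ["1990-04-12", "1985-11-03", "1992-07-30"],
})
arrests = pd.DataFrame({
    "first_name": ["Maria", "Jame"],   # "Jame" is a typo in the administrative file
    "last_name": ["Lopez", "Smith"],
    "dob": ["1990-04-12", "1985-11-03"],
})

# Exact matching: keep a match only if name and date of birth agree character for character.
exact = experiment.merge(arrests, on=["first_name", "last_name", "dob"],
                         how="left", indicator=True)
experiment["arrested"] = (exact["_merge"] == "both").astype(int)
print(experiment[["person_id", "arrested"]])
# James Smith was in fact arrested but is coded 0: the typo creates a false negative.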
We argue that this “conservative” approach, while seemingly logical, is not optimal: it can attenuate estimates of treatment effects and therefore produce Type II errors. Exact matching is particularly problematic for rare outcomes and small sample sizes. How can this be? Stringent character-match requirements minimize false positive matches but maximize false negative matches, which often results in higher total error and, therefore, more attenuated estimates. By contrast, matches performed with machine-learning algorithms (probabilistic matching) tend to minimize total error by allowing some flexibility in the match.
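As an illustrative sketch of the mechanism, drawing on the standard result for non-differential misclassification of a binary outcome (the paper’s own analytic result may take a different form): let Y be the true outcome, Y* the match-based outcome, α the false negative rate, and β the false positive rate, with both rates unrelated to treatment status T. Then

E[Y^{*} \mid T] = (1-\alpha)\,E[Y \mid T] + \beta\,(1 - E[Y \mid T]) = \beta + (1-\alpha-\beta)\,E[Y \mid T],
\qquad
\tau^{*} = (1-\alpha-\beta)\,\tau .

Every point of total error (α + β) shrinks the estimated effect proportionally toward zero, which is why a strategy that trades a small reduction in false positives for a large increase in false negatives attenuates the estimate.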
In the paper, we derive an analytic result for the consequences of matching error for treatment effect estimation and then present simulation results showing how the problem varies across combinations of the relevant inputs: total error rate, base rate, and sample size. We then turn to an empirical example that illustrates the difference between “conservative” naïve matching strategies and matching with a machine-learning algorithm. We conclude on an optimistic note by showing that the consequences of attenuated estimates can be mitigated using matches derived from machine learning.
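A minimal simulation sketch of the kind of exercise described above (the function name, parameter values, and error rates are illustrative assumptions, not the paper’s design):

import numpy as np

rng = np.random.default_rng(0)

def simulated_estimate(n, base_rate, effect, fn_rate, fp_rate, reps=2000):
    """Average difference-in-means estimate when the matched outcome
    misses true positives (fn_rate) and admits spurious matches (fp_rate)."""
    estimates = []
    for _ in range(reps):
        treat = rng.integers(0, 2, n)                      # random assignment
        p = base_rate - effect * treat                     # treatment lowers arrest risk
        y = rng.binomial(1, p)                             # true arrest outcome
        # Misclassification induced by matching error (non-differential by arm)
        y_obs = np.where(y == 1,
                         rng.binomial(1, 1 - fn_rate, n),  # missed matches -> false negatives
                         rng.binomial(1, fp_rate, n))      # spurious matches -> false positives
        estimates.append(y_obs[treat == 1].mean() - y_obs[treat == 0].mean())
    return np.mean(estimates)

# Few false positives, many false negatives (higher total error), as with exact matching:
print(simulated_estimate(n=2000, base_rate=0.20, effect=0.05, fn_rate=0.25, fp_rate=0.01))
# Some of each, lower total error, as with probabilistic matching:
print(simulated_estimate(n=2000, base_rate=0.20, effect=0.05, fn_rate=0.05, fp_rate=0.05))

With the true effect fixed at -0.05, the first estimate is noticeably more attenuated toward zero than the second, mirroring the contrast drawn in the paper between stringent and flexible matching.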