Panel Paper: Conducting an at-Scale Policy Experiment for a National Population: What Is to be Gained or Lost Relative to an Experiment Using Informed Volunteers? Evidence from the Social Security Administration’s Benefit Offset National Demonstration (BOND)

Thursday, November 8, 2018
Lincoln 3 - Exhibit Level (Marriott Wardman Park)


David Stapleton, Mathematica Policy Research and Stephen Bell, Westat


The Benefit Offset National Demonstration (BOND) tested a $1 reduction in Social Security Disability Insurance (SSDI) benefits for every $2 in earnings above the level where benefits currently drop from full benefits to zero. SSA conducted a “population” experiment: it applied the offset to a nationwide, randomly selected sample of all SSDI beneficiaries. Most other social demonstrations randomly assign informed volunteers. SSA also conducted a parallel “volunteer” experiment to 1) learn as much as feasible about impacts on those most likely to exploit the benefit offset, and 2) test enhancements to benefits counseling. This pair of experiments provides a chance to assess the relative strengths and weaknesses of population and volunteer experiments.

Population experiments are rare due to the potential for harm to treatment subjects. A population experiment was practical for BOND because the rule changes, by nature, were essentially guaranteed not to harm treatment subjects. In the face of strong ethical objections to a population experiment, a volunteer experiment may be the only choice. Even so, designers of volunteer experiments seeking to maximize their value need to understand both what they must give up to be ethical and what they might gain.

The presumed advantage of population relative to volunteer experiments is that impact estimates are unbiased estimates of what impacts would be if the same rules were applied to the entire population; it is not necessary to infer impacts based on self-selected volunteers. The population experiment has relative disadvantages, however. Because the new rules are salient to only a small share of the population, power considerations require sample sizes for population experiments that are much larger than for volunteer experiments. Dispersion of the experimental samples around the country, to ensure national representation, adds more to the cost of implementation for a population sample than for a volunteer sample, possibly jeopardizing the integrity of implementation. The smaller samples for volunteer experiments have two other advantages relative to the larger samples for population experiments: the feasibility of conducting baseline and follow-up surveys, and the feasibility of testing variants of the primary intervention.
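
The power argument can be illustrated with a rough calculation. The sketch below is not drawn from the BOND design; it assumes a standard normal-approximation formula for a two-sided comparison of two means, treats the outcome standard deviation as identical in both settings, and uses hypothetical function names and inputs. It shows that if only a share of the sample responds to the new rules, the average effect shrinks in proportion to that share and the required sample grows with the inverse square of the share.

```python
import math
from statistics import NormalDist


def n_per_arm(effect, sd, alpha=0.05, power=0.80):
    """Approximate sample size per arm for a two-sided test comparing
    two means (normal approximation, equal arm sizes, common sd)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_power) * sd / effect) ** 2)


def diluted_n_per_arm(effect_among_affected, affected_share, sd,
                      alpha=0.05, power=0.80):
    """Sample size per arm when only `affected_share` of subjects respond.
    The average effect over the whole sample is the effect among the
    affected multiplied by their share (others are assumed unaffected)."""
    average_effect = effect_among_affected * affected_share
    return n_per_arm(average_effect, sd, alpha, power)


# Hypothetical inputs, chosen only for illustration:
# an effect of 0.5 standard deviations among those who respond.
everyone_affected = diluted_n_per_arm(0.5, 1.0, sd=1.0)
five_percent_affected = diluted_n_per_arm(0.5, 0.05, sd=1.0)
print(everyone_affected, five_percent_affected)
```

With these made-up inputs, a 5 percent affected share raises the required sample per arm by a factor of roughly 1 / 0.05² = 400. In practice the shares, effect sizes, and outcome variances differ between a volunteer and a population sample, so the calculation indicates only the direction and rough order of magnitude of the difference.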

Projection of impacts from volunteers to impacts for a national population is problematic because the new rules might be salient to the behavior of many in the recruitment pool who did not volunteer. We analyze the extent to which offset use by the volunteers and impacts observed for the volunteer treatment group can account, respectively, for offset use and impacts observed for the population treatment group. Another important consideration concerns the nature of outreach. Outreach for a volunteer experiment must be intensive enough that volunteers can attest to a basic understanding of the experiment before they volunteer; this may exceed the outreach required under a national program and may contribute to bias. Outreach to a population treatment group is necessary, but need not be adequate to support informed consent. The BOND results demonstrate how critical outreach is to the design and interpretation of both volunteer and population experiments.