This paper reports on an evaluation in which results from follow-up surveys contradicted findings from administrative records. It examines how this discrepancy emerged and describes how it was addressed analytically.
The UK Employment Retention and Advancement (ERA) demonstration was the most comprehensive social experiment ever conducted in the UK. It examined the extent to which a combination of post-employment advisory support and financial incentives could help lone parents on welfare and long-term unemployed people to find sustained full-time employment. Randomized individuals were tracked for five years through administrative data, and a random sub-sample of lone parents was surveyed after 1, 2 and 5 years. The survey approach distinguished between lone parents working 16-29 hours per week and those working fewer hours (or not at all). The five-year response rates for the two groups were 69% and 62%, respectively.
Non-response did not appear to introduce a systematic difference between the treatment and control groups, despite response rates being somewhat higher for the treatment group than for the control group. There were, however, significant differences between respondents and non-respondents, which raises concerns about the representativeness of the respondent sample. Estimated employment impacts for respondents were similar to those for the full sample, but estimated earnings impacts based on the respondent sample were much larger. Had the survey been the only source of earnings data, the estimated impacts might have dramatically overstated the effectiveness of the program.
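As a rough illustration of this check, the sketch below compares a simple treatment-control difference in means computed on the full sample with the same estimate computed on survey respondents only, using an administrative outcome available for everyone. The DataFrame and column names (treat, responded, admin_earnings_y5) are hypothetical; the abstract does not describe the authors' estimation code.

```python
import pandas as pd

def impact(df: pd.DataFrame, outcome: str) -> float:
    """Treatment-control difference in mean outcomes (treat coded 0/1)."""
    means = df.groupby("treat")[outcome].mean()
    return means[1] - means[0]

def attrition_check(df: pd.DataFrame, outcome: str) -> pd.Series:
    """Compare the impact on the full sample with the impact on survey
    respondents only, using the same administrative outcome for both."""
    full = impact(df, outcome)
    respondents_only = impact(df[df["responded"] == 1], outcome)
    return pd.Series({
        "full_sample": full,
        "respondents_only": respondents_only,
        "difference": respondents_only - full,
    })

# Hypothetical usage:
# df = pd.read_csv("era_linked_records.csv")   # columns: treat, responded, admin_earnings_y5
# print(attrition_check(df, "admin_earnings_y5"))
```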
Aside from this headline result, we discuss three other findings. First, employment rates observed in administrative data match those reported in survey responses fairly well, with agreement in approximately three-quarters of cases. Consistent with administrative data capturing only earnings above a certain threshold, the match rate is much lower among survey respondents reporting part-time work.
Second, there are also disparities within the administrative data. Information on employment and information on earnings are drawn from different sources. Using these to compute, respectively, employment in a financial year and positive earnings in a financial year shows a match rate that, while high (around 80%), is not perfect. This raises a concern about the reliability of the administrative data as a measure of outcomes, although they still offer an important means of testing whether restricting analysis to survey respondents biases estimated impacts.
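Both comparisons above reduce to an agreement rate between two binary indicators for the same financial year. A minimal sketch of that calculation is below; the indicator names are hypothetical, and the reported figures (roughly three-quarters and roughly 80%) come from the paper, not from this code.

```python
import pandas as pd

def match_rate(a: pd.Series, b: pd.Series) -> float:
    """Share of cases in which two 0/1 indicators agree, ignoring missing values."""
    paired = pd.concat([a, b], axis=1).dropna()
    return float((paired.iloc[:, 0] == paired.iloc[:, 1]).mean())

# Hypothetical indicators for the same financial year:
#   survey_employed - employment reported in the follow-up survey
#   admin_employed  - employment recorded in the administrative employment source
#   admin_pos_earn  - positive earnings recorded in the administrative earnings source
# print(match_rate(df["survey_employed"], df["admin_employed"]))  # survey vs administrative data
# print(match_rate(df["admin_employed"], df["admin_pos_earn"]))   # within administrative data
```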
Third, attempts to adjust for attrition so as to bring estimates from survey respondents into line with full-sample results were unsuccessful. This may be because attrition was driven by post-randomization behavior and outcomes.
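The abstract does not say which adjustment was attempted; one standard option is inverse-probability weighting, in which respondents are weighted by the inverse of their estimated probability of responding, modeled on baseline covariates. The sketch below (hypothetical column names, scikit-learn logistic regression) illustrates that approach. If attrition depends on post-randomization behavior or outcomes, as suggested above, weighting on baseline covariates alone would not be expected to remove the bias.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

def ipw_impact(df: pd.DataFrame, outcome: str, covariates: list) -> float:
    """Treatment-control difference among survey respondents, reweighted by the
    inverse of the estimated probability of responding (non-response adjustment)."""
    # Model response as a function of baseline covariates only.
    model = LogisticRegression(max_iter=1000)
    model.fit(df[covariates], df["responded"])
    p_respond = model.predict_proba(df[covariates])[:, 1]

    # Keep respondents and attach their inverse-probability weights.
    resp_mask = df["responded"].to_numpy() == 1
    resp = df[resp_mask].copy()
    resp["w"] = 1.0 / p_respond[resp_mask]

    def weighted_mean(group: pd.DataFrame) -> float:
        return float(np.average(group[outcome], weights=group["w"]))

    # Weighted treatment-control difference in means among respondents.
    return weighted_mean(resp[resp["treat"] == 1]) - weighted_mean(resp[resp["treat"] == 0])
```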
Our findings highlight the importance of collecting multiple data sources to triangulate results. They also caution that there may not always be a single data source that can be regarded as fully accurate. By probing the differences between estimates based on different data sources, we can feel more confident in our interpretation of overall findings.