Panel Paper:
Estimating Labor Market Returns to an Early Childhood Intervention: A Comparison of Survey, Administrative, Matched, and Imputed Data
The CLS collected self-reported income information at ages 33-36 and also acquired administrative records on earnings for the same period. Preliminary findings indicate that self-reported income is 20 percent higher than income from administrative records. Our adjusted models show that earnings of the preschool group are 13 percent higher than earnings of the comparison group. Results for non-imputed values were very similar to those using multiple imputation (13%-16%), and lower than those using regression analysis or single imputation (20%-21%). All of these results used a Heckman correction for censoring. We found that the censoring correction is important because earnings are observed only for those in the labor force who are employed. Without the censoring correction, the estimated effects of the program are 10 percentage points higher than with it. We also tested the sensitivity of our results to selection, to the sample used (administrative, survey, or combined), and to the choice of predictors for imputation. We also pay close attention to the earnings of the formerly incarcerated in light of the new results in Schanzenbach et al. (2016).
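As an illustration of the censoring correction discussed above: a Heckman correction is commonly implemented in two steps, first a probit model for being employed, then an earnings regression on the employed that adds the inverse Mills ratio as an extra regressor. The sketch below computes only that correction term, assuming a standard normal selection error. The function names and the example index values are hypothetical and are not taken from the study.

```python
import math


def normal_pdf(z: float) -> float:
    """Standard normal density phi(z)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z: float) -> float:
    """Standard normal distribution function Phi(z), via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def inverse_mills_ratio(z: float) -> float:
    """Correction term lambda(z) = phi(z) / Phi(z).

    Here z is the fitted probit index for being employed (assumed to come
    from a first-step selection model that is not shown in this sketch).
    """
    return normal_pdf(z) / normal_cdf(z)


# Hypothetical probit indices: a low index means a low employment probability.
for z in (-1.0, 0.0, 1.0):
    print(f"index={z:+.1f}  P(employed)={normal_cdf(z):.3f}  "
          f"lambda={inverse_mills_ratio(z):.3f}")
```

The term is largest for people with a low predicted probability of employment, which is where the observed earners are least representative of the full sample.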
This study aims to promote discussion among policy researchers on the challenges of estimating earnings when data are missing and when only one source of data is available (self-report or administrative records). It also provides practical guidance on basic questions such as “when should missing values for income be imputed?”, “what type of information do we need to have?”, “what type of imputation technique is more appropriate to use?”, and “how sensitive are our results to censoring correction?” We propose strategies toward a more rigorous evaluation of earnings with missing values.