Panel Paper:
Estimation in the Presence of Selection on Unobservables: An Approach Using Econometric and Machine Learning Methods
Monday, July 29, 2019
40.047C - Level 0 (Universitat Pompeu Fabra)
Missing data poses a fundamental challenge for causal inference when the mechanisms that drive missingness are not directly observable to the analyst. This paper develops a new econometric method for accurately estimating unobserved individual outcomes in non-random samples. We contribute to the causal inference literature by integrating new machine learning techniques into standard causal inference methods. We model the decision-making process of "agents" (e.g., doctors or judges) whose decisions result in outcomes being observed for only a selected sample of the population. We leverage exogenous factors that influence the intensity of selection to identify unobservable outcomes. We build on the model developed in Burger and McLaren (Health Economics 2017), which identifies the selection mechanism through a combination of instrumental variables, structural modeling, and weak distributional assumptions on the error term. This paper integrates machine learning methods (i.e., LASSO) into traditional econometric approaches to build a hybrid human-machine learning algorithm that extracts more information from observed data than standard econometric approaches alone and quantifies biases in "agent" decision-making. We implement what the machine learning literature calls "honest" estimation and inference, which uses out-of-sample validation to avoid data mining and overfitting while producing asymptotically valid standard errors. We develop two applications that address important questions in the health policy and criminal justice literatures to demonstrate the broad applicability of our method and its real-world value for policy decision-making.
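As a rough illustration of the "honest" sample-splitting idea mentioned in the abstract, the sketch below selects covariates with LASSO on one half of a simulated sample and then re-estimates the selected coefficients by ordinary least squares on the other half. It is not the authors' estimator: it does not model selection on unobservables or use instruments, it omits standard errors, and the simulated data, the penalty `lam=0.3`, and all function names are assumptions made only for this example.

```python
import random

def soft_threshold(z, lam):
    """Soft-thresholding operator used in LASSO coordinate descent."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0

def lasso_cd(X, y, lam, n_iter=100):
    """Coordinate descent for (1/2n)||y - Xb||^2 + lam*||b||_1 (no intercept)."""
    n, p = len(X), len(X[0])
    b = [0.0] * p
    resid = list(y)  # residual y - Xb; equals y because b starts at zero
    col_ss = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            # Covariance of column j with the residual that excludes column j
            rho = sum(X[i][j] * (resid[i] + X[i][j] * b[j]) for i in range(n)) / n
            new = soft_threshold(rho, lam) / col_ss[j]
            delta = new - b[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                b[j] = new
    return b

def ols(X, y):
    """Least squares via the normal equations and Gauss-Jordan elimination."""
    n, p = len(X), len(X[0])
    A = [[sum(X[i][a] * X[i][c] for i in range(n)) for c in range(p)]
         + [sum(X[i][a] * y[i] for i in range(n))] for a in range(p)]
    for k in range(p):
        piv = max(range(k, p), key=lambda r: abs(A[r][k]))
        A[k], A[piv] = A[piv], A[k]
        for r in range(p):
            if r != k:
                f = A[r][k] / A[k][k]
                A[r] = [A[r][c] - f * A[k][c] for c in range(p + 1)]
    return [A[k][p] / A[k][k] for k in range(p)]

# Simulated data (an assumption for illustration): only covariates 0 and 3 matter.
rng = random.Random(0)
n, p = 400, 10
true_beta = [2.0, 0.0, 0.0, -1.5, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
X = [[rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]
y = [sum(x[j] * true_beta[j] for j in range(p)) + rng.gauss(0, 1) for x in X]

# Split once: the first half chooses the model, the second half estimates it.
half = n // 2
X_sel, y_sel = X[:half], y[:half]
X_est, y_est = X[half:], y[half:]

# Step 1: LASSO on the selection half picks the covariates to keep.
selected = [j for j, b in enumerate(lasso_cd(X_sel, y_sel, lam=0.3)) if b != 0.0]

# Step 2: unpenalized OLS on the held-out half, using only the selected covariates.
X_sub = [[row[j] for j in selected] for row in X_est]
coef = dict(zip(selected, ols(X_sub, y_est)))

print("selected covariates:", selected)
print("held-out OLS estimates:", coef)
```

Because the estimation half plays no role in choosing which covariates enter the model, the second-stage estimates are not contaminated by the model search, which is the intuition behind using held-out data to guard against overfitting.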