Poster Paper:
Understanding Computation of TOT Estimator Standard Errors in Program Evaluation
Thursday, November 12, 2015
Riverfront South/Central (Hyatt Regency Miami)
In a major national training program evaluation, the Department of Labor's (DOL) Green Jobs-Health Care (GJ-HC) program evaluation, the analysis team planned to compute treatment-on-the-treated (TOT) estimates as well as intent-to-treat (ITT) estimates, and considered both instrumental variables (IV) estimation and the so-called "Bloom adjustment" to make the conversion. In theory the two approaches produce the same standard errors, but in practice they may differ, especially in small samples, because the first stage, which identifies program enrollees and no-shows, is estimated with error. This implies that IV-generated standard errors should be larger in magnitude than those produced by the simple "division" approach. However, preliminary analyses and simulations show that the IV approach produced very slightly smaller standard errors in the GJ-HC data, the opposite of what one would expect. This paper will examine the issue more fully to clarify the implications for the program evaluation field.

Noncompliance with assigned treatment status is common in randomized trials. It complicates the identification of causal effects because units select whether to adhere to their assigned status, so exposure to treatment is not as-if random. The instrumental variables (IV) framework provides a coherent way to understand the assumptions needed to identify causal effects when noncompliance occurs. Within that framework, estimation is typically done with the Wald estimator, whose standard error requires either exact methods or an asymptotic approximation, typically based on the Delta method. This paper will explore the accuracy of an alternative approximation to the standard error of the Wald estimator that is widely used in the evaluation literature.
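To make the comparison concrete, the following is a minimal sketch of the two approaches under one-sided noncompliance (no control-group crossovers), the setting where the Bloom adjustment applies. The simulated data, sample sizes, and effect sizes are purely illustrative and are not drawn from the GJ-HC evaluation; the point is only that the Bloom-divided standard error and the conventional IV (2SLS) standard error can differ slightly even when the point estimates coincide.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000

# Random assignment with one-sided noncompliance:
# some assigned units never enroll, no control units enroll.
z = rng.integers(0, 2, n)                # random assignment (the instrument)
complier = rng.random(n) < 0.6           # 60% would enroll if assigned
d = z * complier                         # actual enrollment (the treatment)
y = 1.0 + 2.0 * d + rng.normal(0, 3, n)  # illustrative true TOT effect = 2.0

# ITT: difference in mean outcomes by assignment, with its standard error
y1, y0 = y[z == 1], y[z == 0]
itt = y1.mean() - y0.mean()
se_itt = np.sqrt(y1.var(ddof=1) / len(y1) + y0.var(ddof=1) / len(y0))

# Bloom adjustment: divide the ITT estimate AND its standard error
# by the first-stage take-up rate, treating that rate as known.
p = d[z == 1].mean()
tot_bloom = itt / p
se_bloom = se_itt / p

# IV (2SLS) with the same data: point estimate via the Wald ratio,
# standard error via the conventional homoskedastic 2SLS formula,
# which reflects estimation error in the first stage.
X = np.column_stack([np.ones(n), d])     # regressors: intercept, enrollment
Z = np.column_stack([np.ones(n), z])     # instruments: intercept, assignment
beta = np.linalg.solve(Z.T @ X, Z.T @ y)
u = y - X @ beta                         # 2SLS residuals
sigma2 = (u @ u) / (n - 2)
ZX_inv = np.linalg.inv(Z.T @ X)
V = sigma2 * ZX_inv @ (Z.T @ Z) @ ZX_inv.T
tot_iv, se_iv = beta[1], np.sqrt(V[1, 1])

print(f"Bloom: TOT = {tot_bloom:.3f}, SE = {se_bloom:.3f}")
print(f"IV:    TOT = {tot_iv:.3f}, SE = {se_iv:.3f}")
```

With one-sided noncompliance the two point estimates are algebraically identical (the Wald ratio reduces to ITT divided by the take-up rate); only the standard errors differ, which is exactly the discrepancy the paper investigates.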
We will evaluate this method using both simulations and an empirical application based on data from the GJ-HC training program evaluation, comparing different methods of inference. We think this topic will be of interest to the APPAM audience given its general interest in estimating causal effects and program evaluation methodology.