Panel Paper:
Probing Impact Heterogeneity Using Machine Learning Methods in the Evaluation of Early College High Schools in North Carolina
Results from existing analyses show that early colleges included in the study sample have had positive and statistically significant impacts on key predictors of success in college and receipt of a postsecondary credential. Through eight years after 9th grade, treatment students had significantly higher rates of ever enrolling in college (90% treatment vs. 74% control), attainment of any postsecondary credential (37% treatment vs. 22% control), and attainment of a four-year degree (18% treatment vs. 13% control).
There is also some evidence of larger impacts on students who are disadvantaged and underprepared for high school, but the heterogeneity patterns are not consistent across outcome measures. This proposed paper will apply machine learning (ML) methods suggested for exploring impact heterogeneity, such as regression trees and random forests (Wager & Athey, 2015; Athey & Imbens, 2016; Davis & Heller, 2017), to examine variation in the impacts of early colleges more systematically. The conventional approach explores impact heterogeneity through analyses of subgroups defined by observed baseline characteristics. The ML methods provide a more flexible framework: they search for heterogeneity over data-driven, high-dimensional functions of baseline covariates and may therefore reveal impact variation that subgroup analyses miss. The early college high school impact study is a good candidate for these methods given its fairly large sample and its rich set of baseline covariates, which include student-level demographic and socioeconomic characteristics, engagement with schooling, and academic achievement.
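To make the idea of a data-driven search concrete, the following is a minimal sketch of the simplest possible case: a single split on one covariate, chosen to maximize the difference in estimated impacts between the two resulting groups. It is an illustration only and not the study's analysis. The data are simulated, and the names `simulate`, `diff_in_means`, `best_split`, and the `achievement` covariate are hypothetical. The cited methods grow full trees or forests over many covariates and add safeguards, such as sample splitting, that this sketch omits.

```python
import random

def simulate(n=4000, seed=0):
    """Simulate a randomized experiment with one baseline covariate.

    Assumption for illustration: the true impact is larger (0.30) for
    students with low baseline achievement (score <= 3) than for others (0.05).
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        achievement = rng.randint(0, 9)          # hypothetical baseline score
        treated = 1 if rng.random() < 0.5 else 0  # random assignment
        impact = 0.30 if achievement <= 3 else 0.05
        outcome = 0.5 + 0.02 * achievement + impact * treated + rng.gauss(0, 0.1)
        rows.append((achievement, treated, outcome))
    return rows

def diff_in_means(rows):
    """Treatment-minus-control difference in mean outcomes, or None if a group is empty."""
    treated = [y for _, t, y in rows if t == 1]
    control = [y for _, t, y in rows if t == 0]
    if not treated or not control:
        return None
    return sum(treated) / len(treated) - sum(control) / len(control)

def best_split(rows, min_leaf=50):
    """Find the threshold whose two sides differ most in estimated impact.

    Returns (gap, threshold, impact_at_or_below, impact_above), or None
    if no threshold leaves at least min_leaf rows on each side.
    """
    best = None
    for c in sorted({x for x, _, _ in rows}):
        left = [r for r in rows if r[0] <= c]
        right = [r for r in rows if r[0] > c]
        if len(left) < min_leaf or len(right) < min_leaf:
            continue
        d_left, d_right = diff_in_means(left), diff_in_means(right)
        if d_left is None or d_right is None:
            continue
        gap = abs(d_left - d_right)
        if best is None or gap > best[0]:
            best = (gap, c, d_left, d_right)
    return best

if __name__ == "__main__":
    gap, threshold, low, high = best_split(simulate())
    print(f"split at achievement <= {threshold}: "
          f"impact {low:.2f} below vs. {high:.2f} above (gap {gap:.2f})")
```

Because the split is chosen to maximize the gap, the estimated difference on the same data tends to be overstated. This is one reason the cited approaches estimate impacts on data not used to choose the splits.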