Predicting Family Homelessness Using Machine Learning
To address these challenges, we assemble a novel administrative data set of disadvantaged New Yorkers and predict family shelter stays using a variety of machine learning algorithms and predictors based on benefits history, demographics, shelter histories, housing court interactions such as evictions, and building and neighborhood characteristics. Our models demonstrate considerable predictive accuracy, identifying the riskiest ten percent of actual shelter applicants with 66 percent precision and the riskiest half with 20 percent precision. We find that individual, building, and neighborhood characteristics all help predict family shelter entry, and we use the strongest predictors to develop an easily implemented heuristic risk model.
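The precision figures above refer to the share of true shelter entrants among the families a model ranks as riskiest. A minimal sketch of that metric follows, assuming each family has a model risk score and a 0/1 outcome label; the function name `precision_at_top_fraction` and the toy data are illustrative assumptions, not the study's code or data.

```python
def precision_at_top_fraction(scores, labels, fraction):
    """Share of positives among the highest-scoring `fraction` of cases.

    scores   -- model risk scores, higher means riskier (hypothetical input)
    labels   -- 1 if the family entered shelter, else 0 (hypothetical input)
    fraction -- share of the population to flag, e.g. 0.10 for the top decile
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not scores:
        raise ValueError("scores must not be empty")
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")

    # Number of cases flagged as high risk (at least one).
    k = max(1, round(len(scores) * fraction))

    # Rank cases from riskiest to least risky and keep the top k.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    top_labels = [label for _, label in ranked[:k]]

    return sum(top_labels) / k


# Toy data for illustration only; these are not values from the study.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
toy_labels = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
top_fifth = precision_at_top_fraction(toy_scores, toy_labels, 0.2)
top_half = precision_at_top_fraction(toy_scores, toy_labels, 0.5)
```

Precision typically falls as the flagged fraction grows, because lower-ranked cases are less likely to be true entrants; this matches the pattern reported for the top ten percent versus the top half.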
To measure the potential gains from machine learning targeting over family self-assessment, we link our sample to data from Homebase, New York City's primary homelessness prevention program. We compare the shelter risk of individuals who seek services through Homebase with that of individuals our models identify as high risk. Combining treatment effect parameters estimated in previous work with our prediction results, we simulate the number of shelter entries prevented under the current program design and under an algorithm-directed design. At the same level of outreach, algorithmic models could increase the precision with which Homebase services reach at-risk families from 19 percent to 35 percent, averting almost 1,000 shelter entries over a two-year period.
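The logic of this comparison can be sketched with simple arithmetic. The sketch below assumes that expected averted entries equal the number of families served, times the share who would otherwise enter shelter (the targeting precision), times the relative reduction in entry probability from services. This is a simplified illustration, not the authors' simulation; the function name, the outreach level of 1,000 families, and the 0.5 relative reduction are hypothetical placeholders, and only the two precision values come from the abstract.

```python
def expected_averted_entries(families_served, precision, relative_reduction):
    """Rough expected number of shelter entries averted by serving a group.

    families_served    -- number of families receiving services (hypothetical)
    precision          -- share of served families who would otherwise enter shelter
    relative_reduction -- assumed proportional drop in entry risk from services
    """
    if families_served < 0:
        raise ValueError("families_served must be non-negative")
    if not 0 <= precision <= 1:
        raise ValueError("precision must be in [0, 1]")
    if not 0 <= relative_reduction <= 1:
        raise ValueError("relative_reduction must be in [0, 1]")
    return families_served * precision * relative_reduction


# Hypothetical outreach level and treatment effect, chosen only for illustration.
OUTREACH = 1_000
ASSUMED_RELATIVE_REDUCTION = 0.5

# Precision values as stated in the abstract.
current_design = expected_averted_entries(OUTREACH, 0.19, ASSUMED_RELATIVE_REDUCTION)
algorithmic_design = expected_averted_entries(OUTREACH, 0.35, ASSUMED_RELATIVE_REDUCTION)
additional_averted = algorithmic_design - current_design
```

Under these assumptions, holding outreach fixed, the gain from better targeting scales with the difference in precision between the two designs.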