Panel Paper: Using Big Data and Social Media to Understand Neighborhood Conditions

Friday, November 8, 2019
I.M. Pei Tower: Terrace Level, Columbine (Sheraton Denver Downtown)

*Names in bold indicate Presenter

Constantine E. Kontokosta, New York University, Lance Freeman, Columbia University and Yuan Lai, Massachusetts Institute of Technology


Community development practitioners, local residents, and policy makers need accurate, timely and reliable data on neighborhood conditions. For many years the decennial census was the only data source that was accurate, reliable and easily accessible at the neighborhood level. Because it is released only once every 10 years, however, census data are of limited usefulness in terms of timeliness. During the first decade of this century, increasing amounts of data became available at the neighborhood level. Moreover, the Census Bureau began releasing the American Community Survey, which is updated on an annual basis. Municipalities also began making data related to city services (e.g., property sales and crime locations) accessible on a regular basis. Nongovernmental organizations such as the National Partnership for Neighborhood Indicators also began to organize and help neighborhoods collect, curate and disseminate an array of neighborhood data. In what might be considered a third wave of data accessibility, spawned by the rise of big data, the ubiquity of smartphones, and the popularity of social media, new types of data with the potential to help us understand neighborhood-level conditions and trends are increasingly becoming available. Scholars have used social media to study sentiment towards public transit (Schweitzer 2014), online listings to analyze rental housing market dynamics (Boeing and Waddell 2017), and Twitter to examine traffic incidents (Gu, Qian, and Chen 2016). These studies suggest that new types of data, in the form of social media and/or big data, have utility for policy-making purposes.

For the practitioner who needs data at the neighborhood level, however, it is not clear which types of data provide the most useful information for understanding neighborhood dynamics.

Community development practitioners have yet to systematically explore the potential of these new data sources to help us understand neighborhood-level conditions. This paper is an early attempt to document these novel data sources and describe how they might be used by local residents, community development practitioners, urban planners and other policy makers.

In this paper we survey the novel data sources that appear most promising for documenting neighborhood-level conditions, with a focus on the data’s practical utility in terms of accessibility, accuracy and reliability. We also present results from a pilot study that uses data from Twitter and compares those results to data from Zillow to check for accuracy.

The results of this paper will be of interest to policy makers and practitioners who rely on neighborhood-level data in their work.