Panel Paper: Modernizing Gold Standard Records (GSR) for event coding of protest data in English and Spanish

Friday, November 3, 2017
Dusable (Hyatt Regency Chicago)


Javier Osorio (1), Viveca Pavon (2) and Jennifer S. Holmes (2); (1) John Jay College, (2) University of Texas, Dallas


This paper advances a novel methodology for generating Gold Standard Records (GSR) for event data. We present methodological and technological developments that modernize GSR generation by integrating recent natural language processing techniques with traditional manual coding. We apply the methodology to protest events reported in English and Spanish text.

An emerging trend in computational linguistics is the development of applications to study political behavior, with an emphasis on protest movements. The performance of these applications is generally measured against ad hoc or small GSR that suffer from questionable validity, narrow scope, lack of transparency, and limited or no replicability. Without an adequate set of conceptual standards and methodological procedures for generating and assessing reliable, transparent, and replicable GSR, the performance metrics of emerging computerized event coding protocols remain questionable. The discrepancy between these increasingly sophisticated techniques and questionable GSR does not advance the ongoing efforts of the Data Access and Research Transparency (DA-RT) initiative to raise standards in the discipline.

We generate GSR for event data on social protest extracted from text written in English and Spanish. The corpus is a large collection of news reports on social protest drawn from Gigaword in English and Spanish, which allows public access to the source text and easy, transparent replication of the coding protocol. The technology for enhanced human coding is based on BART NLP; it produces verifiable event coding and facilitates the computation of inter-coder reliability metrics. The resulting GSR then serve as a baseline for evaluating and comparing the performance of natural language processing techniques such as Named Entity Recognition, Part of Speech tagging, and ultimately event coding of protest data in English and Spanish using Petrarch 2 and Eventus ID. The results provide greater transparency about the validity of GSR on protest data in different languages, and help identify the strengths and limitations of distinct methodological approaches to computerized event data generation.
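As an illustration of the two kinds of measurement mentioned above, the following minimal sketch computes an inter-coder reliability score and scores automated output against a GSR. The abstract does not say which metrics the authors use; Cohen's kappa and precision, recall, and F1 are assumed here as common choices, and the function names, labels, and document identifiers are hypothetical.

```python
from collections import Counter


def cohen_kappa(coder_a, coder_b):
    """Cohen's kappa for two coders labeling the same items (assumed metric)."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("both coders must label the same non-empty set of items")
    n = len(coder_a)
    # Share of items on which the two coders agree.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Agreement expected by chance, from each coder's label frequencies.
    freq_a = Counter(coder_a)
    freq_b = Counter(coder_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        # Both coders used a single identical label; agreement is trivially perfect.
        return 1.0
    return (observed - expected) / (1 - expected)


def precision_recall_f1(gold_events, predicted_events):
    """Score machine-coded events against gold standard events (exact match)."""
    gold = set(gold_events)
    predicted = set(predicted_events)
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical labels from two human coders for four news reports.
coder_a = ["protest", "protest", "none", "none"]
coder_b = ["protest", "none", "none", "none"]
kappa = cohen_kappa(coder_a, coder_b)

# Hypothetical (document id, event type) pairs: GSR versus an automated coder.
gold = [("doc1", "protest"), ("doc2", "protest"), ("doc3", "protest")]
machine = [("doc1", "protest"), ("doc2", "protest"), ("doc4", "protest")]
precision, recall, f1 = precision_recall_f1(gold, machine)
```

Exact matching on (document, event type) pairs is a deliberate simplification; a fuller evaluation would also compare actors, targets, dates, and locations.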