Institute for Software Research
School of Computer Science, Carnegie Mellon University


New Methods for Large-scale Analysis of Social Identities and Stereotypes

Kenneth Joseph

June 2016

Ph.D. Thesis (SC)


Keywords: Computational Social Science, Affect Control Theory, Natural Language Processing, Bayesian Networks

Social identities, the labels we use to describe ourselves and others, carry with them stereotypes that have significant impacts on our social lives. Our stereotypes, sometimes without us knowing, guide our decisions on whom to talk to and whom to stay away from, whom to befriend and whom to bully, whom to treat with reverence and whom to view with disgust.

Despite these impacts of identities and stereotypes on our lives, existing methods used to understand them are lacking. In this thesis, I first develop three novel computational tools that further our ability to test and utilize existing social theory on identity and stereotypes. These tools include a method to extract identities from Twitter data, a method to infer affective stereotypes from newspaper data, and a method to infer both affective and semantic stereotypes from Twitter data. Case studies using these methods provide insights into Twitter data relevant to the Eric Garner and Michael Brown tragedies and into both Twitter and newspaper data from the "Arab Spring".

Results from these case studies motivate the need for not only new methods for existing theory, but new social theory as well. To this end, I develop a new sociotheoretic model of identity labeling - how we choose which label to apply to others in a particular situation. The model combines data, methods and theory from the social sciences and machine learning, providing an important example of the surprisingly rich interconnections between these fields.

183 pages

Kathleen M. Carley (Chair)
Jason Hong
Eric Xing
Lynn Smith-Lovin (Duke University)

William L. Scherlis, Director, Institute for Software Research
Andrew W. Moore, Dean, School of Computer Science
