CMU-ISR-17-108
Institute for Software Research
School of Computer Science, Carnegie Mellon University




Detection and Analysis of Online Extremist Communities

Matthew Curran Benigni

May 2017

Ph.D. Thesis
Societal Computing

CMU-ISR-17-108.pdf


Keywords: Covert Network Detection, Community Detection, Annotated Networks, Multilayer Networks, Heterogeneous Networks, Spectral Clustering, Socialbots, Botnets

Online social networks have become a powerful venue for political activism. In many cases large, insular online communities form that have been shown to be powerful diffusion mechanisms of both misinformation and propaganda. In some cases the users of these groups advocate actions or policies that could be construed as extreme along nearly any distribution of opinion; such groups are thus called Online Extremist Communities (OECs). Although these communities appear increasingly common, little is known about how these groups form or the methods used to influence them. The work in this thesis provides researchers with a methodological framework to study these groups by answering three critical research questions:

  • How can we detect large dynamic online activist or extremist communities?
  • What automated tools are used to build, isolate, and influence these communities?
  • What methods can be used to gain novel insight into large online activist or extremist communities?
These group members' social ties can be inferred based on the various affordances offered by online social networks (OSNs) for group curation. By developing heterogeneous, annotated graph representations of user behavior, I can efficiently extract online activist discussion cores using an ensemble of unsupervised machine learning methods. I call this technique Ensemble Agreement Clustering. Through manual inspection, these discussion cores can then often be used as training data to detect the larger community. I present a novel supervised learning algorithm called Multiplex Vertex Classification for network bipartition on heterogeneous, annotated graphs. This methodological pipeline has also proven useful for social botnet detection, and a study of large, complex social botnets used for propaganda dissemination is provided as well. Throughout this thesis I provide Twitter case studies including communities focused on the Islamic State of Iraq and al-Sham (ISIS), the ongoing Syrian Revolution, the Euromaidan Movement in Ukraine, and the alt-Right.

143 pages

Thesis Committee:
Kathleen M. Carley (Chair)
Zico Kolter
Daniel B. Neill (Heinz)
Randy Garrett

William L. Scherlis, Director, Institute for Software Research
Andrew W. Moore, Dean, School of Computer Science

