Computer Science Department
School of Computer Science, Carnegie Mellon University


Understanding and Managing Propagation on
Large Networks – Theory, Algorithms, and Models

B. Aditya Prakash

September 2012

Ph.D. Thesis


Keywords: Data mining, graph mining, time series analysis, arbitrary networks, virus propagation, cascades, viral marketing, contagion, memes, immunization, culprits, epidemic threshold, tipping-points, winner-takes-all, co-existence, eigen-drop, information diffusion models, competing species, epidemiology, outliers, eigenvalues, NetShield, NetSleuth, Smart-Alloc, SpikeM, CTM

How do contagions spread in population networks? What happens if the networks change with time? Which hospitals should we give vaccines to, for maximum effect? How to detect sources of rumors on Twitter/Facebook? These questions and many others such as which group should we market to, for maximizing product penetration, how quickly news travels in online media and how the relative frequencies of competing tasks evolve are all related to propagation/cascade-like phenomena on networks.

In this thesis, we present novel theory, algorithms and models for propagation processes on large static and dynamic networks, focusing on:

1. Theory: We tackle several fundamental questions like determining if there will be an epidemic, given the underlying networks and virus propagation models and predicting who-wins when viruses (or memes or products etc.) compete. We give a unifying answer for the threshold based on eigenvalues, and prove the surprising winner-takes-all’ result and other subtle phase-transitions for competition among viruses.
2. Algorithms: Based on our analysis, we give dramatically better algorithms for important tasks like effective immunization and reliably detecting culprits of epidemics. Thanks to our carefully designed algorithms, we achieve 6x fewer infections on real hospital patient-transfer graphs while also being significantly faster than other competitors (upto 30,000x).
3. Models: Finally using our insights, we study numerous datasets to develop powerful general models for information diffusion and competing species in a variety of situations. Our models unify earlier patterns and results, yet being succinct and enable challenging tasks like trend forecasting, spotting outliers and answering ‘what-if’ questions.

Our inter-disciplinary approach has led to many discoveries in this thesis, with broad applications spanning areas like public health, social media, product marketing and networking. We are arguably the first to present a systematic study of propagation and immunization of single as well as multiple viruses on arbitrary, real and time-varying networks as the vast majority of the literature focuses on structured topologies, cliques, and related un-realistic models.

231 pages

Return to: SCS Technical Report Collection
School of Computer Science

This page maintained by