CMU-ML-06-113
Machine Learning Department
School of Computer Science, Carnegie Mellon University



Cascading Behavior in Large Blog Graphs
Patterns and a Model

Jure Leskovec, Mary McGlohon, Christos Faloutsos,
Natalie Glance*, Matthew Hurst*

October 2006



Keywords: Blog, cascade, information propagation, information diffusion, power-law

How do blogs cite and influence each other? How do such links evolve? Does the popularity of old blog posts drop exponentially with time? These are some of the questions that we address in this work. Our goal is to build a model that generates realistic cascades, so that it can help us with link prediction and outlier detection.

Blogs (weblogs) have become an important medium of information because of their timely publication, ease of use, and wide availability. In fact, they often make headlines by discussing and uncovering evidence about political events and facts. Blogs often link to one another, creating a publicly available record of how information and influence spread through an underlying social network. Aggregating links from several blog posts creates a directed graph, which we analyze to discover the patterns of information propagation in blogspace and thereby understand the underlying social network.
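
The aggregation step can be illustrated with a minimal sketch. It is not the authors' code: the function name `build_blog_graph`, the toy post and blog identifiers, and the choice to drop links between posts of the same blog are all assumptions made for illustration. It collapses post-to-post links into weighted, directed blog-to-blog edges.

```python
from collections import Counter

def build_blog_graph(post_to_blog, post_links):
    """Collapse post-to-post links into weighted blog-to-blog edges.

    post_to_blog: dict mapping a post id to the blog that published it.
    post_links:   iterable of (citing_post, cited_post) pairs.
    Returns a Counter keyed by (citing_blog, cited_blog).
    """
    edges = Counter()
    for citing_post, cited_post in post_links:
        citing_blog = post_to_blog[citing_post]
        cited_blog = post_to_blog[cited_post]
        # Assumption: links between posts of the same blog are skipped.
        if citing_blog != cited_blog:
            edges[(citing_blog, cited_blog)] += 1
    return edges

# Hypothetical toy data: four posts spread over three blogs.
post_to_blog = {"p1": "A", "p2": "B", "p3": "B", "p4": "C"}
post_links = [("p2", "p1"), ("p3", "p1"), ("p4", "p2"), ("p3", "p2")]

blog_graph = build_blog_graph(post_to_blog, post_links)
for (src, dst), weight in sorted(blog_graph.items()):
    print(f"{src} -> {dst} (links: {weight})")
```

Keeping the edge weights, rather than a plain set of edges, preserves how often one blog cites another, which is one possible starting point for studying influence between blogs.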

Here we report some surprising findings about the structure of blog linking and information propagation, based on our analysis of one of the largest available datasets, with 45,000 blogs and ~2.2 million blog postings. Our analysis also sheds light on how rumors, viruses, and ideas propagate over social and computer networks. We also present a simple model that mimics the spread of information in the blogosphere and produces information cascades very similar to those found in real life.

24 pages

*Nielsen BuzzMetrics, Pittsburgh, PA

