Machine Learning Department
School of Computer Science, Carnegie Mellon University


Dynamics of Large Networks

Jurij Leskovec

September 2008

Ph.D. Thesis


Keywords: Network analysis, graph mining, complex networks, network evolution, small-world phenomena, densification, Kronecker graphs, information propagation, diffusion, cascades, viral marketing, outbreak detection, graph partitioning, network community structure

A basic premise behind the study of large networks is that interaction leads to complex collective behavior. In our work we found interesting and counterintuitive patterns in time-evolving networks, which change some of the basic assumptions that were made in the past. We then develop models that explain the processes that govern network evolution, fit such models to real networks, and use them to generate realistic graphs or give formal explanations of their properties. In addition, our work has a wide range of applications: it can help us spot anomalous graphs and outliers, forecast future graph structure, and run simulations of network evolution.

Another important aspect of our research is the study of "local" patterns and structures of propagation in networks. We aim to identify the building blocks of networks and find the patterns of influence that these blocks have on information or virus propagation over the network. Our recent work included the study of the spread of influence in a large person-to-person product recommendation network and its effect on purchases. We also model the propagation of information on the blogosphere, and propose algorithms to efficiently find influential nodes in the network.

A central topic of our thesis is also the analysis of large datasets, as certain network properties only emerge and thus become visible when dealing with lots of data. We analyze the world's largest social and communication network, that of Microsoft Instant Messenger, with 240 million people and 255 billion conversations. We also made interesting and counterintuitive observations about network community structure that suggest that only small network clusters exist, and that they merge and vanish as they grow.

388 pages
