Computer Science Department
School of Computer Science, Carnegie Mellon University


User Behavior Modeling with Large-Scale Graph Analysis

Alex Beutel

May 2016

Ph.D. Thesis


Keywords: Behavior modeling, graph mining, scalable machine learning, fraud detection, recommendation systems, clustering, co-clustering, factorization, machine learning systems, distributed systems

Can we model how fraudsters work to distinguish them from normal users? Can we predict not just which movie a person will like, but also why? How can we find when a student will become confused or where patients in a hospital system are getting infected? How can we effectively model large attributed graphs of complex interactions?

In this dissertation we understand user behavior through modeling graphs. Online, users interact not just with each other in social networks, but also with the world around them–supporting politicians, watching movies, buying clothing, searching for restaurants and finding doctors. These interactions often include insightful contextual information as attributes, such as the time of the interaction and ratings or reviews about the interaction. The breadth of interactions and contextual information being stored presents a new frontier for graph modeling.

To improve our modeling of user behavior, we focus on three broad challenges: (1) modeling abnormal behavior, (2) modeling normal behavior and (3) scaling machine learning. To more effectively model and detect abnormal behavior, we model how fraudsters work, catching previously undetected fraud on Facebook, Twitter, and TencentWeibo and improving classification accuracy by up to 68%. By designing flexible and interpretable models of normal behavior, we can predict why you will like a particular movie. Last, we scale modeling of large hypergraphs by designing machine learning systems that scale to hundreds of gigabytes of data, billions of parameters, and are 26 times faster than previous methods. This dissertation provides a foundation for making graph modeling useful for many other applications as well as offers new directions for designing more powerful and flexible models.

275 pages

Thesis Committee:
Christos Faloutsos (Co-Chair)
Alexander J. Smola (Co-Chair)
Geoffrey J. Gordon
Phillip S. Yu (University of Illinoise at Chicago)

Frank Pfenning, Head, Computer Science Department
Andrew W. Moore, Dean, School of Computer Science

Return to: SCS Technical Report Collection
School of Computer Science

This page maintained by