Computer Science Department
School of Computer Science, Carnegie Mellon University


Reconstruction and Applications of
Collective Storylines from Web Photos Collections

Gunhee Kim

September 2013

Ph.D. Thesis


Keywords: Computer Vision, Machine Learning, Optimization

Widespread access to photo-taking devices and high-speed Internet has combined with rampant social networking to produce an explosion in picture sharing on Web platforms. In this environment, new challenges in image acquisition, processing, and sharing have emerged, creating exciting opportunities for research in computer vision and multimedia data mining. In this dissertation, we explore one of these problems: the reconstruction of collective storylines as an efficient yet comprehensive structural summary of the ever-growing volume of image data shared online.

More specifically, the goal of this dissertation is as follows. Given large-scale online image collections and their associated meta-data, we aim to create collective storylines by jointly inferring the temporal trends and the overlapping contents of the image collections. We also explore novel computer vision and data mining applications that take advantage of the reconstructed photo storylines.

To achieve this objective, we develop the required technologies along three research directions: (1) understanding the temporal trends of image collections, (2) discovering overlapping contents across image collections, and (3) reconstructing collective photo storylines and exploring their applications. The first direction addresses the problem of understanding which topics are popular, when, and among whom in the image collections, while the second studies approaches for detecting salient and recurring contents across the image collections in the form of bounding boxes or pixel-wise segmentations. Finally, building on the results of the first two directions, we propose algorithms for reconstructing branching storyline graphs and explore their promising applications at the intersection of computer vision and multimedia data mining.

171 pages
