CMU-CS-03-173
Computer Science Department
School of Computer Science, Carnegie Mellon University
A Robust Subspace Approach to Extracting
Layers from Image Sequences
Qifa Ke
August 2003
Ph.D. Thesis
CMU-CS-03-173.ps
CMU-CS-03-173.pdf
Keywords: Layer extraction, layered representation, subspace,
clustering, robust, video segmentation, video analysis, ego-motion
A layer is a 2D sub-image whose pixels share the common
apparent motion of some 3D scene plane. Representing videos with such
layers has many important applications, such as video compression,
3D scene and motion analysis, object detection and tracking, and
vehicle navigation. Extracting layers from videos involves solving
three subproblems: 1) segment the image into sub-regions (layers);
2) estimate the 2D motion of each layer; and 3) determine the number of
layers. These three subproblems are highly intertwined, making the
layer extraction problem very challenging. Existing approaches to
layer extraction are limited because they 1) require a good initial
segmentation, 2) make strong assumptions about the scene, 3) cannot fully
and simultaneously exploit the spatial and temporal constraints in video,
and 4) cluster unstably in high-dimensional space. This thesis
presents a subspace approach to layer extraction that avoids
these limitations. We first show that the homographies induced by
the planar patches in the scene form a linear subspace whose dimension
is as low as two or three in many applications. We then formulate the
layer extraction problem as clustering in this low-dimensional subspace.
Each layer in the input images forms a well-defined cluster in the
subspace, and a simple mean-shift-based clustering algorithm can reliably
identify the clusters, and thus the layers. A proof is presented to show that
the subspace approach is guaranteed to significantly increase layer
discriminability, due to its ability to simultaneously utilize spatial
and temporal constraints in the video. We present a detailed robust
algorithm for subspace-based layer extraction, as well as experimental
results on a variety of real image sequences.
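
The two steps named in the abstract, projecting per-patch motion measurements into a low-dimensional subspace and then clustering them with mean shift, can be illustrated with a minimal sketch. This is not the thesis's algorithm: the function names, the use of an SVD to estimate the subspace, the flat kernel, and the mode-merging threshold are all assumptions made here for illustration. The input W is assumed to hold one row per local image patch, with the columns stacking that patch's estimated motion parameters across frames.

```python
import numpy as np


def project_to_subspace(W, d):
    """Project rows of W onto the top-d principal directions (assumed SVD-based)."""
    Wc = W - W.mean(axis=0)                      # center the measurements
    _, _, Vt = np.linalg.svd(Wc, full_matrices=False)
    return Wc @ Vt[:d].T                         # (n_patches, d) subspace coordinates


def mean_shift(X, bandwidth, n_iter=50, tol=1e-6):
    """Flat-kernel mean shift: move each point to the mean of the data within bandwidth."""
    modes = X.copy()
    for _ in range(n_iter):
        # distances from every current mode to every original data point
        dist = np.linalg.norm(modes[:, None, :] - X[None, :, :], axis=2)
        weights = (dist < bandwidth).astype(float)
        new_modes = weights @ X / weights.sum(axis=1, keepdims=True)
        converged = np.abs(new_modes - modes).max() < tol
        modes = new_modes
        if converged:
            break
    return modes


def label_modes(modes, bandwidth):
    """Merge converged modes closer than bandwidth / 2 into one cluster (hypothetical rule)."""
    centers = []
    labels = np.empty(len(modes), dtype=int)
    for i, m in enumerate(modes):
        for k, c in enumerate(centers):
            if np.linalg.norm(m - c) < bandwidth / 2:
                labels[i] = k
                break
        else:
            centers.append(m)                    # first mode of a new cluster
            labels[i] = len(centers) - 1
    return labels, np.array(centers)
```

In this sketch, each resulting cluster label would correspond to one candidate layer, and the number of clusters found gives an estimate of the number of layers. The bandwidth is a free parameter that would have to be chosen for the data at hand.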
171 pages