CMU-CS-97-111
Computer Science Department
School of Computer Science, Carnegie Mellon University
Video Skimming and Characterization through the Combination of Image
and Language Understanding Techniques
Michael A. Smith, Takeo Kanade
February 1997
CMU-CS-97-111.ps
CMU-CS-97-111A.ps
CMU-CS-97-111.ps.gz
CMU-CS-97-111A.ps.gz
Keywords: Video skimming, audio skim, image skim, keyphrases,
characterization, integrated technology, video compaction
Digital video is rapidly becoming important for education, entertainment,
and a host of multimedia applications. As video collections grow to
thousands of hours, technology is needed to
effectively browse segments in a short time without losing the content of
the video. We propose a method to extract the significant audio and video
information and create a "skim" video that represents a very short
synopsis of the original. The goal of this work is to show the utility of
integrating language and image understanding techniques for video skimming
by extracting significant information, such as specific objects, audio
keywords and relevant video structure. The resulting skim video is much
shorter, with compaction as high as 20:1, and yet retains the essential
content of the original segment.
14 pages