Computer Science Department
School of Computer Science, Carnegie Mellon University


Automatic Modeling and Localization for Object Recognition

Mark Damon Wheeler

October 1996

Ph.D. Thesis

Keywords: Localization, pose refinement, automatic object modeling, recognition, robust statistics, consensus, M-estimation, surface merging

Accurately estimating an object's pose (location) in an image is important for practical implementations and applications of object recognition. Recognition algorithms often trade accuracy of the pose estimate for efficiency, which usually results in brittle and inaccurate recognition. One solution is object localization: a local search for the object's true pose given a rough initial estimate of the pose. Localization is made difficult by the unfavorable characteristics of real images (for example, noise, clutter, occlusion, and missing data).

In this thesis, we present novel algorithms for localizing 3D objects in 3D range-image data (3D-3D localization) and for localizing 3D objects in 2D intensity-image data (3D-2D localization). Our localization algorithms utilize robust statistical techniques to reduce the sensitivity of the algorithms to the noise, clutter, missing data, and occlusion which are common in real images. Our localization results demonstrate that our algorithms can accurately determine the pose in noisy, cluttered images despite significant errors in the initial pose estimate.

Acquiring accurate object models that facilitate localization is also of great practical importance for object recognition. In the past, models for recognition and localization were typically created by hand using computer-aided design (CAD) tools. Manual modeling is expensive and limited in accuracy. In this thesis, we present novel algorithms to automatically construct object-localization models from many images of the object. We present a consensus-search approach to determine which parts of the image data merit inclusion in the model. Using this approach, our modeling algorithms are relatively insensitive to the imperfections and noise typical of real image data. Our results demonstrate that our modeling algorithms can construct very accurate geometric models from rather noisy input data.

Our robust algorithms for modeling and localization in many ways unify the treatment of these problems in the range image and intensity image domains. The modeling and localization framework presented in this thesis provides a sound basis for building reliable object-recognition systems.

We have analyzed the performance of our modeling and localization algorithms on a wide variety of objects. Our results demonstrate that our algorithms improve upon previous approaches in terms of accuracy and reduced sensitivity to the typical imperfections of real image data.

246 pages
