Inference Machines: Parsing Scenes via Iterated Predictions

Posted on:2014-07-29

Degree:Ph.D

Type:Dissertation

University:Carnegie Mellon University

Candidate:Munoz, Daniel

GTID:1458390008956054

Subject:Engineering

Abstract/Summary:

Extracting a rich representation of an environment from visual sensor readings can benefit many tasks in robotics, e.g., path planning, mapping, and object manipulation. While important progress has been made, it remains a difficult problem to effectively parse entire scenes, i.e., to recognize semantic objects, man-made structures, and land-forms. This process requires not only recognizing individual entities but also understanding the contextual relations among them.

The prevalent approach to encode such relationships is to use a joint probabilistic or energy-based model which enables one to naturally write down these interactions. Unfortunately, performing exact inference over these expressive models is often intractable and instead we can only approximate the solutions. While there exists a set of sophisticated approximate inference techniques to choose from, the combination of learning and approximate inference for these expressive models is still poorly understood in theory and limited in practice. Furthermore, using approximate inference on any learned model often leads to suboptimal predictions due to the inherent approximations.

As we ultimately care about predicting the correct labeling of a scene, and not necessarily learning a joint model of the data, this work proposes to instead view the approximate inference process as a modular procedure that is directly trained in order to produce a correct labeling of the scene. Inspired by early hierarchical models in the computer vision literature for scene parsing, the proposed inference procedure is structured to incorporate both feature descriptors and contextual cues computed at multiple resolutions within the scene. We demonstrate that this inference machine framework for parsing scenes via iterated predictions offers the best of both worlds: state-of-the-art classification accuracy and computational efficiency when processing images and/or unorganized 3-D point clouds. Additionally, we address critical problems that arise in practice when parsing scenes on board real-world systems: integrating data from multiple sensor modalities and efficiently processing data that is continuously streaming from the sensors.
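
The idea of treating inference as a sequence of directly trained predictors can be illustrated with a minimal sketch. This is not the dissertation's implementation: it assumes a single-resolution neighborhood graph given as a row-normalized adjacency matrix `A`, uses a plain softmax classifier at every round, and trains each round on its own training-set predictions, which is a simplification. All function names (`fit_softmax`, `neighbor_context`, `train_iterated`, `run_iterated`) are hypothetical.

```python
import numpy as np

def softmax(z):
    # Numerically stable row-wise softmax.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def add_bias(X):
    return np.hstack([X, np.ones((len(X), 1))])

def fit_softmax(X, y, n_classes, steps=300, lr=0.5):
    # Multinomial logistic regression fitted by plain gradient descent.
    Xb = add_bias(X)
    W = np.zeros((Xb.shape[1], n_classes))
    Y = np.eye(n_classes)[y]  # one-hot targets; y must hold integer labels
    for _ in range(steps):
        W -= lr * Xb.T @ (softmax(Xb @ W) - Y) / len(X)
    return W

def predict_proba(W, X):
    return softmax(add_bias(X) @ W)

def neighbor_context(P, A):
    # Contextual cue (assumption): mean of the neighbors' current label distributions.
    return A @ P

def train_iterated(X, y, A, n_classes, n_rounds=3):
    # Train one predictor per round; each round sees the node features plus
    # context computed from the previous round's predictions.
    models = []
    P = np.full((len(X), n_classes), 1.0 / n_classes)  # uninformative start
    for _ in range(n_rounds):
        Z = np.hstack([X, neighbor_context(P, A)])
        W = fit_softmax(Z, y, n_classes)
        models.append(W)
        P = predict_proba(W, Z)
    return models

def run_iterated(models, X, A, n_classes):
    # Inference is a fixed sequence of predictions, one per trained round.
    P = np.full((len(X), n_classes), 1.0 / n_classes)
    for W in models:
        P = predict_proba(W, np.hstack([X, neighbor_context(P, A)]))
    return P.argmax(axis=1)
```

In this sketch, inference costs one classifier evaluation per node per round, so the running time is fixed in advance by `n_rounds` rather than by the convergence of an approximate inference routine over a joint model.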

Keywords/Search Tags:

Inference, Parsing scenes

Related items

1	Parsing Natural Scenes Based On Hierarchical Region Merge
2	Scenes Parsing Research
3	Exploiting Dependency Parsing As An Auxiliary Task To Enhance AMR Parsing
4	The Elimination Of Inference Channel Based On Rough Set Theory
5	Design And Construction Of Distributed JS Parsing System
6	Research Of Face Parsing Based On Convolutional Neural Network
7	Research Of Chinese Sentence Skeleton Parsing Based On Statistical Model
8	Research On Inference Control Technologies In Databases
9	Research And Implement On Chinese Dependency Parsing
10	Research And Application Of Uncertain Inference In Expert Systems