Object Cosegmentation Based On High-Level Image Semantics

Posted on: 2019-03-24
Degree: Master
Type: Thesis
Country: China
Candidate: W H Zuo
GTID: 2348330542469409
Subject: Communication and Information System
Abstract/Summary:
The human visual system not only perceives low-level features such as color, texture, illumination, and edges, but also extracts high-level image semantics, including the class, size, geometric structure, and spatial layout of objects in 2D images. By contrast, scene structure estimation in computer vision usually relies on classical geometry and image processing techniques, and it performs poorly under illumination changes, occlusion, large camera motion, low-texture image regions, and repeated texture structures. Our research begins with how to improve the performance of current high-level semantic reasoning and applies these semantics in new ways to a variety of indoor and outdoor environments. It focuses on how high-level image semantics can guide scene structure reasoning toward more accurate and robust results than traditional geometry-based methods. The content and contributions of our research are as follows:

We propose an unsupervised cosegmentation algorithm for image sets that contain multiple foreground objects and strongly varying backgrounds. The algorithm uses the correspondences between images, and between the different regions within each image, to improve the consistency of the foreground and background models. It does not require the constraints imposed by traditional methods, such as a large appearance difference between foreground and background. Experiments show that our algorithm is more robust than several classic object cosegmentation algorithms when large changes occur in the pose and shape of the objects or in the camera viewpoint.

We present a flexible interactive framework for geometric and semantic annotation of scenes, built on the traditional feed-forward design of visual systems. The framework includes several visual analysis models based on a baseline algorithm and uses the contextual interaction of intrinsic images to improve geometric and semantic reasoning iteratively. The visual analysis models can also interact with one another to refine themselves iteratively. Experiments demonstrate that our intrinsic-image-based feedback design effectively improves the performance of the baseline algorithm.

We present a new algorithm that uses the spatial layout of the room and the spatial constraints between objects in the room to recover the 3D structure of cluttered indoor environments. The volumes of the room and of the objects are both parameterized, and high-level image semantics provide prior information about the objects in the room. We also add volumetric constraints, such as spatial exclusion and volume containment, to better estimate the spatial layout of cluttered rooms and to provide a richer description of the objects. These geometric cues can be critical for subsequent object recognition and for understanding the scene as a whole.

We use sparse views to reconstruct large scenes in 3D. We show how to combine monocular and geometric cues to recover accurate 3D models of outdoor scenes under wide-baseline conditions. The proposed algorithm uses a Markov Random Field model to estimate both the 3D position and the orientation of each superpixel in the images, and it integrates multiple high-level image semantics to refine the 3D modeling process. In addition, an iterative framework is designed to jointly and gradually optimize the estimates of depth and of high-level image semantics. Experiments demonstrate that our method is more robust and accurate than traditional methods when given a small number of images with little overlap between views.
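
The two volumetric constraints named in the abstract, spatial exclusion and volume containment, can be illustrated with a small sketch. It is not the thesis's parameterization, which the abstract does not specify: it assumes, purely for illustration, that the room and every object are axis-aligned 3D boxes, and the names `Box`, `contains`, `overlap_volume`, and `is_feasible` are hypothetical.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Box:
    """Axis-aligned box (an illustrative assumption, not the thesis's model)."""
    lo: tuple  # (x, y, z) minimum corner
    hi: tuple  # (x, y, z) maximum corner


def contains(outer, inner):
    """Volume containment: inner lies entirely inside outer on every axis."""
    return all(
        o_lo <= i_lo and i_hi <= o_hi
        for o_lo, o_hi, i_lo, i_hi in zip(outer.lo, outer.hi, inner.lo, inner.hi)
    )


def overlap_volume(a, b):
    """Volume of the intersection of two boxes (0.0 if they do not overlap)."""
    volume = 1.0
    for a_lo, a_hi, b_lo, b_hi in zip(a.lo, a.hi, b.lo, b.hi):
        side = min(a_hi, b_hi) - max(a_lo, b_lo)
        if side <= 0:
            return 0.0  # separated (or only touching) along this axis
        volume *= side
    return volume


def is_feasible(room, objects):
    """Check containment in the room and pairwise spatial exclusion."""
    if not all(contains(room, obj) for obj in objects):
        return False
    return all(overlap_volume(a, b) == 0.0 for a, b in combinations(objects, 2))


room = Box((0, 0, 0), (4, 5, 3))
bed = Box((0, 0, 0), (2, 3, 1))
table = Box((2.5, 0, 0), (3.5, 1, 1))
sofa = Box((1, 1, 0), (3, 2, 1))  # intersects the bed

print(is_feasible(room, [bed, table]))
print(is_feasible(room, [bed, sofa]))
```

In a layout estimator, a check of this kind would typically be used to reject or penalize candidate room and object hypotheses; how the thesis actually enforces the constraints is not stated in the abstract.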
Keywords/Search Tags:High-level Image Semantics, Scene Structure Reasoning, Interactive Geometric and Semantic Annotation, Object Cosegmentation, Wide-baseline Stereo