Indoor Scene Understanding Based On Convolutional Neural Network And 3D Geometric Context Information

Posted on:2019-06-27

Degree:Master

Type:Thesis

Country:China

Candidate:X M Zhang

GTID:2348330542475006

Subject:Computer Science and Technology

Abstract/Summary:

As one of the most important research topics in computer vision, scene understanding is widely used in fields such as mobile robotics, image retrieval, intelligent surveillance, and smart homes, so it has considerable research significance and application value. Compared with outdoor scenes, research on indoor scenes has progressed slowly and faces more challenges because of their complex structure and the wide range of objects they contain. Focusing on indoor scenes, this paper approaches image understanding from three aspects: scene classification, object detection, and spatial layout estimation. Instead of carrying out the three tasks independently, this paper combines them, because they are interconnected and can provide information to one another. Extensive experiments and comparisons demonstrate the effectiveness of the proposed methods. The main contributions of this paper are as follows:

(1) Based on a convolutional neural network, a scene classification method that incorporates semantic context information is proposed. Scene-level tasks require large amounts of training data because scenes are highly diverse. Semantic context information, namely the co-occurrence of a particular scene type and an object, converts a scene-level task into an object-level one, which reduces the dependence on massive training data. In place of traditional hand-crafted features, a convolutional neural network extracts features automatically, and these features are then combined with the semantic context to obtain the scene category. The experimental results show that the classification method performs well even on a small training set.

(2) Building on traditional layout estimation methods based on geometric context, this paper takes informative edge maps into account. The informative edges obtained from a fully convolutional network serve as a prior for finely dividing the scene layout. This makes the sampling precision less dependent on the sampling frequency and, at the same time, avoids the influence of "clutter" edges. As a result, the method mitigates the occlusion problem, improves the accuracy of layout estimation, and reduces the workload of ranking the layout candidates.

(3) In addition to the 2D information mentioned above, this paper makes full use of geometric and semantic information in 3D space by constructing a 3D spatial structure model. Information in 3D space helps to reduce the influence of occlusion and viewpoint changes, which effectively improves scene understanding.

In summary, this paper introduces semantic context, geometric context, and a three-dimensional spatial geometry model into the scene understanding task. Experimental results verify the effectiveness of the methods on all three aspects: scene classification, object detection, and layout estimation.
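The idea in contribution (1), turning a scene-level decision into an object-level one through scene-object co-occurrence, can be illustrated with a minimal sketch. This is not the thesis's actual model: the co-occurrence table, the object and scene names, the detection threshold, and the naive-Bayes style combination are all assumptions made for illustration, and the function name `classify_scene` is hypothetical. In practice the object scores and the optional scene prior would come from a convolutional neural network.

```python
import math

# Hypothetical co-occurrence table: P(object present | scene).
# The values are invented for illustration, not taken from the thesis.
COOCCURRENCE = {
    "bedroom": {"bed": 0.90, "sofa": 0.20, "stove": 0.01},
    "kitchen": {"bed": 0.01, "sofa": 0.05, "stove": 0.90},
    "living_room": {"bed": 0.05, "sofa": 0.90, "stove": 0.02},
}


def classify_scene(object_scores, cnn_scene_probs=None, threshold=0.5):
    """Combine object detections with scene-object co-occurrence.

    object_scores: dict mapping object name -> detector confidence in [0, 1].
    cnn_scene_probs: optional dict mapping scene -> prior probability
        (for example a CNN's scene prediction); uniform if omitted.
    Returns (best_scene, posterior), where posterior sums to 1.
    """
    log_scores = {}
    for scene, table in COOCCURRENCE.items():
        if cnn_scene_probs is None:
            prior = 1.0 / len(COOCCURRENCE)
        else:
            prior = cnn_scene_probs.get(scene, 0.0)
        # Guard against log(0) when a prior is exactly zero.
        log_p = math.log(max(prior, 1e-9))
        for obj, p_present in table.items():
            detected = object_scores.get(obj, 0.0) >= threshold
            # Naive-Bayes assumption: objects are independent given the scene.
            log_p += math.log(p_present if detected else 1.0 - p_present)
        log_scores[scene] = log_p

    # Normalise in a numerically stable way (softmax over log scores).
    top = max(log_scores.values())
    unnormalised = {s: math.exp(v - top) for s, v in log_scores.items()}
    total = sum(unnormalised.values())
    posterior = {s: v / total for s, v in unnormalised.items()}
    best_scene = max(posterior, key=posterior.get)
    return best_scene, posterior


if __name__ == "__main__":
    scene, posterior = classify_scene({"bed": 0.95, "sofa": 0.10})
    print(scene, posterior)
```

Because the table only relates scenes to objects, it can be estimated from far fewer scene-labelled images than an end-to-end scene classifier would need, which is the data-efficiency argument made in the abstract.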

Keywords/Search Tags:

indoor scene understanding, convolutional neural network, scene classification, object detection, spatial layout estimation

Related items

1	Spatial Layout Estimation Of Indoor Scene Using Informative Edges And Multi-modality Features
2	Research On 3D Indoor Scene Technology For Video Sequences
3	A Coarse-to-fine Estimation Of Spatial Layout Of Indoor Scenes
4	Research On Outdoor Scene Understanding Using Deep Convolutional Neural Networks
5	Research On Scene Understanding Technology Of Indoor Service Robot Based On Deep Convolution Neural Networks
6	Key Technologies Research On Indoor Scene Recognition Of Mobile Service Robot
7	Vision-Based Layout Estimation In Indoor Scenes
8	Object Classification And Detection Based On Attention Mechanism And Knowledge Distillation
9	The Research Of Scene Understanding Neural Network Model
10	Data-driven Indoor Scene 3D Reconstruction And Semantic Understanding