3D Object Detection From Indoor Scenes Based On RGB-D Images

Posted on: 2021-04-21
Degree: Master
Type: Thesis
Country: China
Candidate: C T Yue
GTID: 2428330626458916
Subject: Computer technology
Abstract/Summary:
Object detection is one of the most basic tasks in computer vision. After the initial success of object detection on color images, researchers turned to object detection on 3D data, which is now an active field in computer vision. A strong 3D object detection model opens up many possibilities for autonomous driving and intelligent robots. 3D data comes in many forms. One of them is the RGB-D image, which contains both complete color information, representing the color and contour of an object, and depth information, representing the distance between the object and the sensing device. Based on the RGB-D representation, this thesis proposes a new deep neural network structure for 3D object detection in complex indoor scenes.

The network model is divided into two parts: proposing frustum point clouds and estimating 3D bounding boxes. In the first part, the color image is passed through convolutional layers to obtain a feature map. Proposal regions are selected on the feature map, and the network determines whether each proposal region contains a target object. If it does, the bounding box is given a preliminary adjustment and the object is classified. The color and depth images are then combined to obtain a point cloud, and the frustum point cloud is obtained from the box of the proposal region and a translation of the coordinate system; it serves as the input to the second part. In the second part, a point cloud processing network separates the target object from the frustum point cloud. The object points are then sent into another point cloud processing network, which estimates the real center of the object based on its class and produces the final 3D bounding box, completing the detection.

The main function of the point cloud processing network is to produce point cloud feature maps. Because point cloud data is unordered and rotation invariant, the network is built mainly from multi-layer perceptrons and max pooling. Feature extraction also involves extracting local features in groups and then obtaining global features. The proposed model uses not only the texture features of the color image in the RGB-D image but also the depth image, which is converted into a point cloud so that distance features become spatial structural features. The two kinds of features complement each other to make the most of the available information. A dedicated deep neural network extracts features from each of the two data structures. Finally, the model completes object detection through segmentation and regression on the resulting feature maps.

Besides introducing the methods in detail, this thesis compares the detection results with those of several other leading 3D object detection algorithms. The proposed model has an advantage in overall detection performance, with a 3% to 5% improvement in accuracy. Some detailed design choices are also compared, such as the influence of different model initialization methods and the role of grouping and feature extraction in point cloud processing. The comparisons show that every stage in the network structure has a positive influence on the final detection result. Finally, deeper network structures were tried for extracting color image features. The experimental results show that the color image detection stage affects the final detection results and that the deeper structures improve the accuracy of the model.
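
The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the thesis's network. It only shows three ideas in plain Python: back-projecting a depth image into a point cloud with a pinhole camera model, keeping the points that fall inside a 2D proposal box (the frustum), and applying one shared perceptron layer followed by max pooling so that the result does not depend on the order of the points. The function names, the toy depth image, the camera intrinsics (fx, fy, cx, cy) and the random weights are all assumptions made for illustration. Rotation handling and the grouping of local features are not covered.

```python
import random


def backproject(depth, fx, fy, cx, cy):
    """Turn a depth image (rows of metres, 0 = no reading) into a list of
    (u, v, x, y, z) tuples using the pinhole camera model (assumed here)."""
    points = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z <= 0:
                continue  # skip pixels without a valid depth reading
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            points.append((u, v, x, y, z))
    return points


def frustum_points(points, box):
    """Keep the 3D points whose pixel lies inside the 2D proposal box
    (u_min, v_min, u_max, v_max), bounds inclusive."""
    u0, v0, u1, v1 = box
    return [(x, y, z) for u, v, x, y, z in points
            if u0 <= u <= u1 and v0 <= v <= v1]


def shared_layer(point, weights, bias):
    """One perceptron layer with ReLU, applied to a single point."""
    return [max(0.0, sum(w * p for w, p in zip(row, point)) + b)
            for row, b in zip(weights, bias)]


def global_feature(cloud, weights, bias):
    """Apply the same layer to every point, then max-pool each channel.
    Max pooling makes the output independent of point order."""
    per_point = [shared_layer(p, weights, bias) for p in cloud]
    return [max(channel) for channel in zip(*per_point)]


# Toy 3x4 depth image: an object at 1.5 m in front of a farther background.
depth = [[2.0, 2.0, 0.0, 3.0],
         [2.0, 1.5, 1.5, 3.0],
         [2.0, 1.5, 1.5, 3.0]]
points = backproject(depth, fx=2.0, fy=2.0, cx=1.5, cy=1.0)
frustum = frustum_points(points, box=(1, 1, 2, 2))

rng = random.Random(0)  # fixed seed; the weights are arbitrary, not trained
weights = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(4)]
bias = [0.0] * 4

feature = global_feature(frustum, weights, bias)
shuffled = frustum[:]
rng.shuffle(shuffled)
same_after_shuffle = feature == global_feature(shuffled, weights, bias)
```

In this sketch the pooled feature is the same before and after shuffling, which is the property that makes multi-layer perceptrons with max pooling suitable for unordered point sets. A real detector would stack several trained layers and add the segmentation and box regression stages described above.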
Keywords/Search Tags:RGB-D image, 3D object detection, proposal regions, frustum point cloud, point cloud processing network