
Object Detection And Semantic Segmentation For RGB-D Images With Convolutional Neural Networks

Posted on: 2018-10-12; Degree: Master; Type: Thesis
Country: China; Candidate: G H Deng
GTID: 2348330563452398; Subject: Computer technology
Abstract/Summary:
In recent years, with the growing popularity of depth sensors, the study of RGB-D images has become a research hotspot in computer vision. The main research directions include object detection and semantic segmentation for RGB-D images. Object detection locates the objects in an image and identifies their classes, and it plays an important role in intelligent surveillance. Semantic segmentation assigns a category to every pixel in an image, and it is a basic technique in UAV navigation and autonomous driving. An RGB-D image contains both an RGB image and a depth image. Existing work on RGB-D object detection and semantic segmentation extracts the features of the RGB image and the depth image separately; the resulting accuracy is not high enough and the speed is not fast enough to meet the requirements of industry. This thesis therefore studies RGB-D object detection and semantic segmentation in depth. The specific work is as follows.

First, to extract the features of the RGB image and the depth image at the same time rather than separately, this thesis proposes a scheme that fuses the RGB image and the depth image. The fused image is called an HHG image. It expresses the visual content of both the RGB image and the depth image, so there is no need to train separate network models for the RGB image and the depth image, and the object detection and semantic segmentation tasks run faster.

Second, to improve the accuracy and speed of RGB-D object detection, this thesis presents a scheme based on Faster-RCNN. The scheme first adjusts the network parameters of the Faster-RCNN architecture, then retrains the Faster-RCNN model on HHG images, and finally uses the retrained model to perform object detection on RGB-D images. For the detection stage, we propose a scheme for selecting which candidate bounding boxes to keep, called NMS'. NMS' is a variant of traditional non-maximum suppression (NMS) for pruning candidate boxes. It changes the decision mechanism so that both the overlap between candidate boxes and the number of boxes surrounding a candidate box serve as decision factors.

Third, to improve semantic segmentation performance on RGB-D images, this thesis presents a scheme based on FCN. The scheme first adjusts the network parameters of the FCN architecture, then retrains the FCN model on HHG images, and finally uses the retrained model to perform semantic segmentation on RGB-D images.

Compared with previous experimental results, the detection accuracy of our RGB-D object detection scheme is 9.7% higher than the best previous detection accuracy, and detection is at least 100 times faster. The segmentation accuracy of our RGB-D semantic segmentation scheme is 2.3% higher than the best previous segmentation accuracy. Using NMS' also improves accuracy compared with the same object detection scheme without NMS'.
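
The abstract states only that NMS' uses two decision factors: the overlap between candidate boxes and the number of boxes surrounding a candidate. It does not give the exact rule. The following is a minimal sketch of one possible reading, not the thesis's actual method. The function name `nms_with_neighbor_count`, the weight `alpha`, and the way the neighbor count is combined with the confidence score are all assumptions made for illustration.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms_with_neighbor_count(boxes, scores, iou_thresh=0.5, alpha=0.1):
    """Greedy NMS whose ranking also rewards boxes with many overlapping neighbors.

    Hypothetical rule (an assumption, not taken from the thesis):
        decision = score * (1 + alpha * number_of_overlapping_neighbors)
    With alpha = 0 this reduces to traditional NMS.
    Returns the indices of the kept boxes, in the order they were selected.
    """
    n = len(boxes)
    # Factor 2: how many other candidates overlap each box strongly.
    neighbors = [
        sum(1 for j in range(n)
            if j != i and iou(boxes[i], boxes[j]) >= iou_thresh)
        for i in range(n)
    ]
    decision = [scores[i] * (1.0 + alpha * neighbors[i]) for i in range(n)]
    order = sorted(range(n), key=lambda i: decision[i], reverse=True)

    keep = []
    suppressed = [False] * n
    for i in order:
        if suppressed[i]:
            continue
        keep.append(i)
        # Factor 1: suppress remaining candidates that overlap the kept box.
        for j in order:
            if j != i and not suppressed[j] and iou(boxes[i], boxes[j]) >= iou_thresh:
                suppressed[j] = True
    return keep


# Three overlapping candidates around one object, plus one distant box.
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (2, 2, 12, 12), (50, 50, 60, 60)]
scores = [0.90, 0.85, 0.70, 0.60]
kept_plain = nms_with_neighbor_count(boxes, scores, alpha=0.0)
kept_variant = nms_with_neighbor_count(boxes, scores, alpha=0.1)
```

In this toy input, the middle box of the cluster overlaps both of its neighbors, so under the assumed rule it outranks the slightly higher-scoring box at the edge of the cluster and suppresses both neighbors. With `alpha=0.0` the edge box wins and a second box from the same cluster survives.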
Keywords/Search Tags: RGB-D Image, Object Detection, Semantic Segmentation, Faster-RCNN, Fully Convolutional Networks (FCN)