Object Detection Based On RGB-D

Posted on: 2019-12-20
Degree: Master
Type: Thesis
Country: China
Candidate: X Zhao
Full Text: PDF
GTID: 2428330623468820
Subject: Computational Mathematics
Abstract/Summary:
With the development of artificial intelligence, more and more researchers have begun to study object detection in computer vision. No longer content with research on RGB images alone, they have made detection methods that exploit image depth a hot topic. However, the accuracy and real-time performance of indoor multi-class object detection are sensitive to illumination changes, shooting angle, the number of objects, and object size. To improve detection accuracy, some studies have begun to use deep learning. Although deep learning can effectively extract object features at different levels, its need for large sample sets and long training times prevents it from being widely applied immediately. To improve detection efficiency, on the other hand, many researchers have tried to find all regions that may contain objects from the objects' edge information, thereby reducing the number of detection windows; later work used deep learning to preselect such regions.

To address these problems, this thesis proposes a two-stage method based on RGB-D images. The first stage is object proposal by sub-step merging of super-pixels; the second is object classification by multi-modal data fusion.

(1) In the object proposal stage, the method first segments the image into super-pixels with Simple Linear Iterative Clustering (SLIC) and then merges them in two steps, computing region similarity first from color information and then from depth information, with self-adaptive multi-threshold scales. This order follows the theory that the eye first observes a salient object's color and then its depth. In this way, detection windows with similar color and depth information are extracted, and their number is reduced by filtering them by area and applying non-maximum suppression to overlapping detections. At the end of the process, the number of detection windows is far smaller than with a sliding-window scan, and each region may contain an object or part of an object.

(2) In the object recognition stage, the proposed method uses multi-kernel learning to fuse multi-modal features (color, texture, contour, and depth) extracted from the RGB-D images. Because objects are so varied, they are easily confused when identified by a single feature. For example, it is difficult to distinguish a real apple from one painted in a picture. Multi-modal data fusion captures richer object characteristics in RGB-D images than a single feature or a simple fusion of two features. Finally, the fused feature kernel is fed into an SVM classifier, which completes the object detection procedure.

By setting different threshold segmentation interval parameters and different Gaussian kernel parameters for multi-kernel learning, the thesis compares the proposed method with current mainstream algorithms and concludes that the proposed method has a certain advantage in overall object detection performance. In comparative experiments on the standard RGB-D datasets from the University of Washington and on real-scene datasets captured with a Kinect sensor, the detection rate of the method is 4.7% higher than the state of the art. The sub-step merging of super-pixels also outperforms current mainstream object proposal methods in object localization, needing approximately four times fewer sampling windows than other algorithms at the same recall rate. A comparison of recognition accuracy for individual features and fused features shows that multi-feature fusion achieves much higher overall detection accuracy than individual features or the fusion of two features, and that it also performs well on object categories with different poses.

Conclusion: The experimental results show that the proposed method makes full use of both color and depth information in object localization and classification, which is important for achieving higher accuracy and better real-time performance. The object proposal method based on sub-step merging of super-pixels can also be applied to deep-learning-based object detection.
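
The window-reduction step of the proposal stage (filtering candidate windows by area, then applying non-maximum suppression to overlapping detections) can be illustrated with a minimal sketch. This is not the thesis's implementation: the box format `(x1, y1, x2, y2)`, the per-window scores, the function names, and the `min_area` and `iou_threshold` defaults are all assumptions made for illustration, since the abstract does not state how windows are scored or which thresholds are used.

```python
from typing import List, Sequence, Tuple

# Assumed box format (not given in the abstract): (x1, y1, x2, y2) corners.
Box = Tuple[float, float, float, float]


def area(box: Box) -> float:
    """Area of a box; degenerate or inverted boxes count as zero."""
    x1, y1, x2, y2 = box
    return max(0.0, x2 - x1) * max(0.0, y2 - y1)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two boxes."""
    inter_box = (max(a[0], b[0]), max(a[1], b[1]),
                 min(a[2], b[2]), min(a[3], b[3]))
    inter = area(inter_box)
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0


def filter_and_suppress(boxes: Sequence[Box],
                        scores: Sequence[float],
                        min_area: float = 100.0,       # hypothetical threshold
                        iou_threshold: float = 0.5     # hypothetical threshold
                        ) -> List[int]:
    """Return indices of the windows that survive both steps.

    Step 1: discard windows whose area is below min_area.
    Step 2: greedy non-maximum suppression: visit the remaining windows
            from highest to lowest score and keep a window only if its
            IoU with every already-kept window is at most iou_threshold.
    """
    candidates = [i for i in range(len(boxes)) if area(boxes[i]) >= min_area]
    candidates.sort(key=lambda i: scores[i], reverse=True)

    kept: List[int] = []
    for i in candidates:
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in kept):
            kept.append(i)
    return kept
```

Calling `filter_and_suppress(boxes, scores)` returns the indices of the surviving windows in descending score order. Greedy suppression is the common textbook variant; the thesis may order or score its merged regions differently.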
Keywords/Search Tags:object detection, RGB-D image, sub-step merging of super-pixel, multi-modal data fusion, object proposal, Support Vector Machine