
Study Of The RGB-D Hand-held Object Recognition Based On Convolutional Neural Network

Posted on: 2019-07-17
Degree: Master
Type: Thesis
Country: China
Candidate: J Sun
GTID: 2428330578972065
Subject: Computer technology
Abstract/Summary:
With the development of computer software and hardware, and especially the widespread use of GPUs, the computing power of computers has increased greatly. This has led to the re-emergence and unprecedented development of deep learning, a technology that had been neglected for many years, which in turn has allowed deep learning models, represented by Convolutional Neural Networks (CNNs), to achieve unprecedented success in various computer vision tasks. At the same time, people's material lives have been greatly enriched: smartphones, smart home appliances, and conversational robots are becoming more and more common in daily life, and object recognition technology is widely used in these devices for human-computer interaction. However, object recognition is not a simple task. Objects of the same class can differ considerably in appearance, while objects of different classes can look similar, so accurate object recognition from RGB images alone still faces great challenges. In recent years, the appearance of low-cost RGB-D devices (such as Microsoft's Kinect) has made depth images easier to obtain. As auxiliary information, depth images can reduce the difficulty of image recognition to a certain extent, and they have become a hot research topic. As an important carrier of information, image data plays an indispensable role in human-computer interaction, and hand-held objects account for the vast majority of the objects involved in such interaction. Therefore, this thesis focuses on hand-held object recognition. The main work includes the following:
(1) A deep learning method is used to verify the performance of hand-held object recognition, and on this basis two multi-modal feature fusion methods based on RGB images and depth images are proposed. The proposed methods enable the neural network to automatically learn the importance of the two modal features for the final recognition result, so that the two features complement each other and are fused more effectively.
(2) Few-shot learning is applied to hand-held object recognition, and a new few-shot learning method based on multi-modal feature fusion is proposed. By introducing depth images as auxiliary information, this method effectively improves on traditional few-shot learning and achieves better performance.
(3) A hand-held object recognition data set, HOD-40, is collected and produced. It is composed of 40 types of common objects in our daily life and includes the original image, the segmented image, and the bounding box of the object. Each frame contains not only the common RGB image but also the corresponding depth image...
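Item (1) of the abstract states that the network learns how important each modality is, but the abstract does not give the fusion mechanism. The following is only a minimal sketch of one common way to do this, under stated assumptions: two learnable scalars (here called `modality_logits`, a hypothetical name) are passed through a softmax to obtain weights that sum to one, and the RGB and depth feature vectors are combined as a weighted sum. It is not the thesis's actual method, and the feature values are made up for illustration.

```python
import math


def softmax(logits):
    """Turn raw scores into positive weights that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_features(rgb_feat, depth_feat, modality_logits):
    """Weighted sum of an RGB and a depth feature vector.

    modality_logits is a hypothetical pair of scalars; in a real
    network they would be trainable parameters updated by
    backpropagation, so the model itself decides how much each
    modality contributes. Here they are fixed by hand.
    """
    if len(rgb_feat) != len(depth_feat):
        raise ValueError("feature vectors must have the same length")
    w_rgb, w_depth = softmax(modality_logits)
    return [w_rgb * r + w_depth * d for r, d in zip(rgb_feat, depth_feat)]


# Made-up feature vectors standing in for CNN outputs.
rgb_feat = [0.2, 0.8, 0.5]
depth_feat = [0.6, 0.0, 0.1]

# Equal logits give equal weights, i.e. a plain average.
fused_equal = fuse_features(rgb_feat, depth_feat, [0.0, 0.0])

# A larger first logit makes the fused vector lean toward RGB.
fused_rgb_heavy = fuse_features(rgb_feat, depth_feat, [2.0, 0.0])
```

In this sketch the weights are global scalars. A per-channel or per-sample gate would follow the same pattern with a vector of logits instead of a pair.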
Keywords/Search Tags:Human-computer interaction, Deep learning, Multimodality, Feature fusion, Hand-held object recognition