Research On Technologies Of View-based 3D Object Recognition

Posted on: 2020-02-17
Degree: Master
Type: Thesis
Country: China
Candidate: X C Liu
GTID: 2428330590973224
Subject: Computer technology
Abstract/Summary:
3D object recognition is an important research direction within object recognition. In recent years it has played a major role in robotic grasping, detection, autonomous driving, assembly tasks, and medical image analysis. Compared with shape-based methods, view-based algorithms have recently become popular: they do not rely on complex 3D features, and they can draw on large amounts of data and mature network architectures, which makes them simple and efficient. Compared with recognition from a single view, multiple views complement one another's detail features, which helps considerably under occlusion, shading, and other difficult conditions. Building on the multi-view convolutional neural network, this thesis compares and analyzes how different viewpoint selection schemes affect the model, and re-examines how the model fuses multi-view features. It proposes a view-weighted pooling method that provides richer view image features to the subsequent classification network. Furthermore, because multi-view data acquisition is regular and sequential, the thesis adds a recurrent neural network unit on top of the convolutional neural network and uses it to fuse information from previously seen views. Three different attention modules are also designed into the network so that each view extracts more useful detail along the spatial and channel dimensions. Finally, to give the model the ability to actively select the next best view, the thesis introduces a reinforcement learning module that uses the REINFORCE method with a baseline and is trained jointly with the SGD algorithm. To address the viewpoint "boundary effect" and the training imbalance between sub-networks, the thesis proposes a policy gradient flow enhancement method guided by classification confidence. In addition, a regularization term with a position constraint is added to the loss function to keep the selected views from overlapping one another, so that they are spread more widely around the 3D object and the model learns more global object features.
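The abstract does not say how the view weights are computed, so the following is only a minimal sketch of what view-weighted pooling could look like, not the thesis's actual method. It assumes each view already has a feature vector and a scalar score (both hypothetical here), turns the scores into weights with a softmax, and takes the weighted sum of the view features. The element-wise max across views used by the original multi-view CNN is included for comparison. The function names `view_weighted_pool` and `max_pool_views` are illustrative.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of per-view scores.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def view_weighted_pool(features, scores):
    # features: list of V per-view feature vectors, each of length D.
    # scores:   list of V scalars (assumed to come from some learned scorer).
    # Returns the weighted sum over views (length D) and the weights used.
    weights = softmax(scores)
    dim = len(features[0])
    pooled = [sum(w * f[d] for w, f in zip(weights, features)) for d in range(dim)]
    return pooled, weights

def max_pool_views(features):
    # Baseline: element-wise maximum across views.
    return [max(column) for column in zip(*features)]

# Toy data: three views with two-dimensional features (made-up numbers).
features = [[1.0, 0.0], [0.0, 2.0], [3.0, 1.0]]
scores = [0.0, 0.0, 0.0]  # equal scores reduce the pooling to a plain average
pooled, weights = view_weighted_pool(features, scores)
```

Unlike max pooling, which keeps only the strongest response per feature dimension, a weighted sum lets every view contribute in proportion to its score. This is one plausible reading of the "richer view image feature" the abstract mentions.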
Keywords/Search Tags: 3D object detection, multi-view image, RNN, reinforcement learning