
Research On Visual Target Recognition And Localization Technology Based On Deep Learning

Posted on: 2020-06-26
Degree: Master
Type: Thesis
Country: China
Candidate: N Wang
Full Text: PDF
GTID: 2428330596975467
Subject: Signal and Information Processing
Abstract/Summary:
With the development of artificial intelligence technology, people expect unmanned products to take on tasks such as transportation and anomaly monitoring in smart cities, as well as intelligence reconnaissance and sensitive-target tracking on future battlefields. This requires machines to accurately identify and locate visual targets of interest from captured image data. In recent years, object detection based on deep learning has achieved great success in computer vision and has become a hot research direction in the field. However, in the application scenarios above, current mainstream deep-learning detection methods suffer from problems such as missed detections and misjudgment of similar targets. Overly deep networks also bring a heavy computational load and a large model size, which makes them difficult to deploy and use in practice. In view of these problems, this dissertation focuses on visual object recognition and localization technology based on deep learning. The main work completed is as follows:

(1) To address the intrinsic defect that convolutional neural networks cannot effectively use the spatial structure information of an image, a method for extracting spatial structure features based on a recurrent neural network is proposed. A trainable spatial structure feature extractor is designed as a new neural network layer that can be combined with convolution to obtain more expressive fused features. To improve the real-time performance of the feature extractor, a parallelization scheme for forward inference and gradient backpropagation is proposed, and its engineering implementation on the high-performance parallel computing architecture CUDA is given.

(2) For the characteristics of wide-area scenes, a new lightweight backbone network with feature relay amplification and multi-scale feature skip connections is designed to extract features of targets at every scale in a wide-area scene. Further, a scheme is proposed to couple the spatial structure feature extractor of this dissertation into this backbone network, so that it extracts multi-fusion features with stronger expressive ability for the subsequent recognition and localization task networks.

(3) Within the framework of the Faster R-CNN detection method, the recognition and localization task networks are improved. The K-Means method is used to obtain the distribution of target scales, which is then used to select more appropriate preset anchor boxes and reduce the learning burden on the network. A parallel computing method is then proposed to solve the top-K candidate bounding box selection problem and speed up the entire network. Finally, a real-time, accurate and lightweight object detection network suitable for wide-area scenes and unmanned equipment is proposed, which provides high-precision recognition and localization of the relevant visual objects.

(4) On the KITTI and Pascal VOC datasets, comparative experiments and result analysis are conducted against the Faster R-CNN and SSD models. The advantages and disadvantages of the proposed detection model are studied across different scenarios and target classes. The results show that the proposed model achieves better detection performance and real-time performance in wide-area scenarios. The limitations of the proposed model are also analyzed by comparing the various evaluation metrics.
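
The anchor selection step in (3) can be illustrated with a minimal sketch. The abstract states only that K-Means is applied to the target scale distribution. The distance measure used below (assigning each box to the center with the highest IoU), the function names `iou_wh` and `kmeans_anchors`, and the sample box sizes are all assumptions made for illustration, not details taken from the dissertation.

```python
import numpy as np

def iou_wh(boxes, centers):
    """IoU between (width, height) pairs, treating every box as if it
    shared the same top-left corner. Returns shape (n_boxes, n_centers)."""
    inter_w = np.minimum(boxes[:, None, 0], centers[None, :, 0])
    inter_h = np.minimum(boxes[:, None, 1], centers[None, :, 1])
    inter = inter_w * inter_h
    area_b = (boxes[:, 0] * boxes[:, 1])[:, None]
    area_c = (centers[:, 0] * centers[:, 1])[None, :]
    return inter / (area_b + area_c - inter)

def kmeans_anchors(boxes, k, iters=100, seed=0):
    """Cluster ground-truth box sizes into k anchor sizes.
    Assumption: similarity is IoU; the source does not name the metric."""
    rng = np.random.default_rng(seed)
    boxes = np.asarray(boxes, dtype=float)
    # Start from k distinct ground-truth boxes chosen at random.
    centers = boxes[rng.choice(len(boxes), size=k, replace=False)]
    for _ in range(iters):
        # Assign each box to the center it overlaps most.
        assign = np.argmax(iou_wh(boxes, centers), axis=1)
        new_centers = centers.copy()
        for j in range(k):
            members = boxes[assign == j]
            if len(members):  # keep the old center if a cluster is empty
                new_centers[j] = members.mean(axis=0)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    # Return anchors ordered from smallest to largest area.
    return centers[np.argsort(centers[:, 0] * centers[:, 1])]

# Hypothetical (width, height) sizes in pixels: three small, three large.
sample_boxes = [[10, 10], [12, 11], [11, 12], [100, 50], [98, 52], [102, 49]]
anchors = kmeans_anchors(sample_boxes, k=2)
```

In a real pipeline the input would be the width and height of every annotated box in the training set, and `k` would match the number of anchors per location expected by the detection head.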
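
The top-K candidate selection problem mentioned in (3) can be stated concretely with a short sketch. The abstract does not describe the proposed parallel scheme, so this is not the dissertation's method: it only shows the selection task itself on the CPU, using NumPy's partial selection (`np.argpartition`) instead of a full sort. The function name `top_k_boxes` and the sample data are hypothetical.

```python
import numpy as np

def top_k_boxes(boxes, scores, k):
    """Return the k highest-scoring candidate boxes, best first.
    Sketch only: partial selection followed by a sort of the k survivors."""
    boxes = np.asarray(boxes)
    scores = np.asarray(scores)
    k = min(k, len(scores))
    if k == 0:
        return boxes[:0], scores[:0]
    # argpartition places the k largest scores in the first k slots
    # (in arbitrary order) without sorting the whole array.
    idx = np.argpartition(-scores, k - 1)[:k]
    # Sort only the selected k entries by descending score.
    idx = idx[np.argsort(-scores[idx])]
    return boxes[idx], scores[idx]

# Hypothetical candidates as (x1, y1, x2, y2) with objectness scores.
cand_boxes = np.array([[0, 0, 10, 10],
                       [5, 5, 20, 20],
                       [1, 1, 8, 8],
                       [3, 3, 30, 30]])
cand_scores = np.array([0.1, 0.9, 0.5, 0.7])
kept_boxes, kept_scores = top_k_boxes(cand_boxes, cand_scores, k=2)
```

Partial selection avoids ordering the candidates that will be discarded anyway, which is the same cost a parallel implementation would aim to reduce when a region proposal stage produces many thousands of candidates.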
Keywords/Search Tags: Target recognition and localization, Deep learning, Convolutional neural network, CUDA parallel computing