Research On The Pedestrian Detection Based On RetinaNet

Posted on: 2020-05-14
Degree: Master
Type: Thesis
Country: China
Candidate: Z Z Jia
GTID: 2428330596982555
Subject: Mechanical engineering
Abstract/Summary:
As a traditional computer vision task, pedestrian detection is widely used in intelligent transportation and intelligent surveillance, and plays an important role in ensuring the safety of intelligent equipment. Convolutional neural networks are characterized by local connectivity and weight sharing, which has made them widely used in computer vision tasks. Consequently, pedestrian detection has made breakthrough progress since the rise of convolutional neural networks. However, because pedestrians appear at different distances from the camera, their size and scale in the image also vary, which makes pedestrian detection considerably more difficult. For detecting multi-scale targets, the feature pyramid structure in convolutional neural networks has been favored by researchers. RetinaNet is a convolutional neural network algorithm that uses this structure to achieve multi-scale object detection. However, the features extracted by the feature pyramid lack sufficient edge and texture information and exhibit a certain aliasing effect, which affects detection accuracy. This thesis makes corresponding improvements on the basis of RetinaNet. The main work is summarized as follows.

(1) As a general object detection network, RetinaNet is not tailored to the specific task of pedestrian detection. Therefore, this thesis trains the improved RetinaNet on the INRIA dataset and determines its backbone network structure, the scales and aspect ratios of the pre-selected (anchor) boxes, and the weighting coefficient and focusing parameter of Focal Loss. In addition, a multi-scale training method is used, in which one of five image resolutions is randomly selected as the training image resolution, thereby improving the network's ability to detect pedestrians of different scales.

(2) A feature fusion architecture based on a dual feature pyramid is proposed to improve the accuracy of multi-scale pedestrian detection. By introducing shallower convolutional features, this method addresses the problem that the features of each layer, especially the deep features, lack edge information. At IoU thresholds of 0.5 and 0.7, the method reduces the miss rate on the INRIA dataset by 0.23% and 1.03%, respectively, compared with the feature pyramid structure. In the large-scale detection experiment on the Caltech dataset, the miss rate is reduced by 3.22%. To further improve detection accuracy, a dilated convolution module is added to enlarge the receptive field of the deep convolutional features and strengthen the deep pedestrian category features, and feature fusion then improves the pedestrian detection accuracy of the convolutional features at each layer.

(3) Feature enhancement is used to model the interdependence between convolutional feature channels and recalibrate the features, selectively emphasizing those that are beneficial for pedestrian detection. An improved prediction module further integrates and adjusts the fused features to make them more suitable for pedestrian detection, and Soft Non-Maximum Suppression replaces Non-Maximum Suppression in post-processing to improve detection accuracy in crowded scenes. The miss rate on the All test of the Caltech dataset reached 56.65%, and at IoU thresholds of 0.5 and 0.7 the miss rates on the INRIA dataset reached 5.19% and 9.65%, respectively. Compared with other algorithms, the improved RetinaNet achieves better overall performance.
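The Soft Non-Maximum Suppression step mentioned in (3) can be illustrated with a minimal sketch. The abstract does not say which decay function or parameter values the thesis uses, so the Gaussian variant, the `sigma` and `score_threshold` defaults, and the function names `iou` and `soft_nms` below are all assumptions made for illustration. The idea is that a box overlapping an already selected detection has its score decayed rather than being discarded outright, which is what helps when pedestrians stand close together.

```python
import math


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def soft_nms(boxes, scores, sigma=0.5, score_threshold=0.001):
    """Gaussian Soft-NMS (illustrative; parameter values are assumptions).

    Returns a list of (box_index, rescored_confidence) pairs in the
    order in which the boxes were selected.
    """
    candidates = list(zip(range(len(boxes)), scores))
    kept = []
    while candidates:
        # Select the remaining box with the highest current score.
        best = max(candidates, key=lambda c: c[1])
        candidates.remove(best)
        kept.append(best)

        # Decay the scores of the other boxes according to their overlap
        # with the selected box, instead of removing them outright.
        decayed = []
        for idx, score in candidates:
            overlap = iou(boxes[best[0]], boxes[idx])
            new_score = score * math.exp(-(overlap ** 2) / sigma)
            if new_score >= score_threshold:
                decayed.append((idx, new_score))
        candidates = decayed
    return kept
```

With classical Non-Maximum Suppression at an IoU threshold of 0.5, a second pedestrian whose box has an IoU of about 0.68 with a higher-scoring box would be removed. In this sketch it is kept with a reduced score, so a later confidence threshold decides whether it counts as a detection.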
Keywords/Search Tags:Pedestrian Detection, Convolutional Neural Networks, RetinaNet, feature pyramid structure, Miss Rate