Research On Pedestrian Detection And Action Recognition Based On Deep Learning

Posted on: 2021-05-21
Degree: Master
Type: Thesis
Country: China
Candidate: P F Luo
GTID: 2428330602494389
Subject: Control Science and Engineering
Abstract/Summary:
Pedestrian detection is a challenging problem in computer vision and a prerequisite for many vision applications, such as autonomous driving, visual surveillance, and robotics. Over the last decade it has attracted extensive research interest and made great progress. In recent years in particular, the development of deep convolutional neural networks has brought significant progress in general object detection, and pedestrian detection has developed rapidly as a result: general object detection models based on deep convolutional networks have been continually adapted to pedestrian detection and have achieved good performance.

For the pedestrian detection task, this thesis proposes a model based on a receptive field enhancement network. Most current deep-learning-based pedestrian detection models use a divide-and-conquer strategy to handle scale variation. During detection, the receptive field of each feature layer has a fixed size and cannot adapt to the continuous change in pedestrian scale that occurs in practice. Moreover, most of these models use image classification backbone networks as feature extractors. These backbones have only square receptive fields, which are seriously mismatched with the aspect ratio of pedestrians. Both factors degrade detection performance. To address these problems, the proposed model uses a receptive field enhancement module to diversify the receptive fields of the features extracted by the backbone, so that a suitable receptive field is available to match each pedestrian's scale. It then uses a multi-level aggregation module to further aggregate the multi-scale feature layers, that is, to merge the different receptive fields of different layers, yielding a fused feature pyramid for subsequent pedestrian detection. After these transformations, the extracted features are more robust to variation in pedestrian scale. Finally, a series of comparative experiments on benchmark datasets verifies the effectiveness of the proposed method.

In addition, this thesis studies human action recognition, and designs and implements an action recognition model based on a two-stream network structure. The model consists of four sub-modules: a two-dimensional convolutional network, a three-dimensional convolutional network, a feature channel fusion and attention mechanism, and a decoupled detector. Through the two-stream structure built on the two-dimensional and three-dimensional convolutional networks, the static appearance features and the temporal motion context in the video are extracted separately. In the detector module, the classification and regression tasks are decoupled so that each can learn the feature information useful to it.
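The mismatch between square receptive fields and pedestrian shape noted in the abstract can be illustrated with the standard receptive-field recurrence for stacked convolutions. The sketch below is not the thesis's receptive field enhancement module. The layer configurations (`backbone`, the "square" branch, and the "tall" branch) are hypothetical values chosen only for illustration. It shows that giving a branch a larger, dilated kernel along the vertical axis produces a receptive field that is taller than it is wide.

```python
from typing import List, Tuple

# One layer along a single spatial axis: (kernel, stride, dilation).
Layer = Tuple[int, int, int]


def receptive_field(layers: List[Layer]) -> int:
    """Receptive field (in input pixels) of a stack of conv layers on one axis."""
    rf = 1    # a single input pixel sees itself
    jump = 1  # distance in input pixels between adjacent outputs
    for kernel, stride, dilation in layers:
        rf += (kernel - 1) * dilation * jump
        jump *= stride
    return rf


# Hypothetical backbone stem: two 3x3 convolutions with stride 2.
backbone: List[Layer] = [(3, 2, 1), (3, 2, 1)]

# Hypothetical "square" branch: a plain 3x3 convolution on both axes.
square_h = receptive_field(backbone + [(3, 1, 1)])
square_w = receptive_field(backbone + [(3, 1, 1)])

# Hypothetical "tall" branch: 5-tap kernel with dilation 2 vertically,
# plain 3-tap kernel horizontally.
tall_h = receptive_field(backbone + [(5, 1, 2)])
tall_w = receptive_field(backbone + [(3, 1, 1)])

print(square_h, square_w)  # 15 15
print(tall_h, tall_w)      # 39 15
```

Running several such branches in parallel and combining their outputs is one common way to offer a detector a range of receptive field shapes, although the abstract does not say how the thesis's module is constructed.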
Keywords/Search Tags: Pedestrian detection, Action recognition, Convolutional neural network, Deep learning