
Research On Automatic Estimation Algorithm Of The Number Of People In Natural Scenes

Posted on: 2020-06-08
Degree: Master
Type: Thesis
Country: China
Candidate: L H He
GTID: 2428330572984067
Subject: Information and Communication Engineering
Abstract/Summary:
Automatic crowd counting refers to the automatic estimation of the number of people in a natural scene image. Accurate estimates not only provide early warning of emergencies, but also give valuable reference information for the scientific scheduling of social resources such as vehicles. Crowd counting algorithms fall into two groups: methods based on deep learning and methods that are not. The deep learning approach uses multi-layer neural networks to learn task-related feature representations directly from observational data, which is more efficient than the hand-designed features used by non-deep-learning methods. This thesis focuses on crowd counting based on deep learning, and in particular on convolutional neural networks. Because of structural properties such as weight sharing, local connectivity, and pooling, the convolutional neural network is well suited to image processing and computer vision tasks.

Deep learning methods for crowd counting are further divided into direct estimation methods and indirect estimation methods based on density maps. A density map represents the spatial distribution of crowd density and can be generated by Gaussian blurring of the marker template. The direct estimation method outputs the number of people directly. The indirect method uses a convolutional network to generate the density map, and then takes the sum of all elements of the density map as the estimated number of people. Through statistical analysis of the annotated data of the UCSD dataset and the MALL dataset, we confirm that the number of people calculated from the ground-truth density map is not exactly the same as the real number. This shows that the indirect estimation method based on density maps is not reliable enough.

Existing research shows that, in many video analysis algorithms based on convolutional neural networks, a motion map can be used as an auxiliary input to improve performance. The motion map contains rich semantic features that enable the network to learn more precise features. Inspired by this, we use the density map, which characterizes the spatial distribution of crowd density, as an auxiliary input to a deep convolutional network, and construct a multi-input multi-scale convolutional neural network (MIMSCNN) model that estimates the number of people in a single image directly. Experiments on the UCSD dataset and the MALL dataset confirm that MIMSCNN is superior to state-of-the-art methods such as MCNN, Hydra-CNN, and ResNet + MRF.

Considering the powerful implicit feature-space representation that generative adversarial networks (GANs) learn from observation data, this thesis also proposes a GANs-based crowd count estimation method (GANR). The GANR algorithm supervises the learning process of the generator G using the error produced by the discriminator D when reconstructing the density map. After the GANs model has been trained, the feature maps generated by G carry key semantic features and contain implicit spatial features that characterize the crowd density distribution. Although the GANR algorithm is inferior to MCNN in performance, its training time is significantly shorter.
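
The density-map construction described in the abstract (Gaussian blurring of the marker template, then summing all elements to obtain a count) can be illustrated with a minimal NumPy sketch. Everything here is an assumption made for illustration and is not taken from the thesis: the function names `gaussian_kernel` and `density_map`, the fixed 15 x 15 kernel with sigma 4.0, and the choice to clip kernels at the image border. Border clipping is only one possible reason why the sum of a ground-truth density map can differ from the annotated count; the abstract does not state which causes the author identified.

```python
import numpy as np


def gaussian_kernel(size: int, sigma: float) -> np.ndarray:
    """Return a size x size Gaussian kernel normalised to sum to 1 (size must be odd)."""
    ax = np.arange(size) - (size - 1) / 2.0
    g = np.exp(-(ax ** 2) / (2.0 * sigma ** 2))
    kernel = np.outer(g, g)
    return kernel / kernel.sum()


def density_map(points, shape, size=15, sigma=4.0):
    """Stamp one unit-mass Gaussian per annotated position (row, col).

    Hypothetical helper: kernels are clipped at the image border, so any
    mass that would fall outside the frame is simply lost.
    """
    h, w = shape
    r = size // 2
    kernel = gaussian_kernel(size, sigma)
    dmap = np.zeros((h, w), dtype=np.float64)
    for row, col in points:
        # Window of the image covered by the kernel, clipped to the frame.
        r0, r1 = max(row - r, 0), min(row + r + 1, h)
        c0, c1 = max(col - r, 0), min(col + r + 1, w)
        # Matching offsets inside the kernel.
        k_r0 = r0 - (row - r)
        k_c0 = c0 - (col - r)
        dmap[r0:r1, c0:c1] += kernel[k_r0:k_r0 + (r1 - r0),
                                     k_c0:k_c0 + (c1 - c0)]
    return dmap


# Two people well inside the frame: the map sums to the true count.
interior = density_map([(32, 32), (20, 40)], (64, 64))
# One person in the image corner: part of that Gaussian is cut off.
border = density_map([(32, 32), (0, 0)], (64, 64))

print("interior sum:", interior.sum())
print("border sum:  ", border.sum())
```

In this sketch the interior case sums to the annotated count of 2 up to floating-point error, while the corner case sums to noticeably less than 2, which is consistent with the abstract's observation that a count derived from the ground-truth density map need not equal the real number of people.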
Keywords/Search Tags: natural scenes, automatic estimation of the number of people, convolutional neural network, generative adversarial networks, density map