
Research Of Crowd Density Estimation Based On Generative Adversarial Network

Posted on: 2020-06-13
Degree: Master
Type: Thesis
Country: China
Candidate: Z Shen
GTID: 2428330623463688
Subject: Electronics and Communications Engineering
Abstract/Summary:
Crowd analysis in video images is an important topic in computer vision. As a research hotspot within crowd analysis, crowd counting is widely used in intelligent security, urban planning, and traffic monitoring. Existing crowd counting algorithms fall into two categories: number regression methods and crowd density estimation algorithms. Number regression methods include detection-based algorithms, which detect human heads or bodies, and regression-based algorithms. They predict the number of individuals by learning individual features, and they perform poorly when occlusion is severe and individuals overlap one another. Crowd density estimation algorithms emerged with the rise of deep learning and exploit the powerful feature learning and representation capabilities of deep convolutional neural networks. Their output can be used not only to predict the number of people but also to generate a density map that reflects how the crowd is distributed. The density map method converts the number regression problem into predicting a crowd density distribution over the whole image, and it uses the spatial information of crowd patches to learn the characteristics of the group. Compared with number regression, the density map method adapts better to complex, dense crowd scenes. Recent frontier work on crowd counting has therefore centered on density map methods.

Although current crowd counting algorithms perform well, the task itself still presents several difficulties. The first is robustness to the scene: application scenarios are complex and variable, and errors caused by the background in different scenarios strongly affect the model's predictions. The second is that dense crowd images often have severe occlusion and multi-scale problems caused by distance. Especially when the number of people in the image exceeds 500, some visible individuals may occupy only a dozen or even a few pixels, resulting in missed or false detections. The third is the quality of the density map. Because of the downsampling caused by the pooling layers in the convolutional network, the resolution of the generated density map is much lower than that of the original image. In addition, using the traditional L2 loss as the objective function tends to blur the density map, which hinders both reflecting the crowd distribution and counting the crowd. How to obtain robust, high-resolution crowd density estimates in complex application scenarios therefore remains a challenging research topic.

This thesis introduces the research background and development trends of crowd counting algorithms. To address the difficulties above, we contribute in the following three aspects. First, the main generator network of the model adopts a U-Net-style encoder-decoder structure. This structure separates the foreground from the background well, reducing the prediction errors caused by different backgrounds. At the same time, the skip connections of the encoder-decoder effectively transfer the features learned by the encoder to the decoder, helping to construct and generate the crowd density map. Second, to address the counting error caused by the multi-scale problem of the crowd, we propose two schemes: a cross-scale consistency constraint and a scale aggregation module. The cross-scale consistency constraint reduces cross-scale errors by minimizing the residual between density values predicted at two scales. The receptive fields of the convolution kernels in the scale aggregation module cover 4 scales, so the module adapts better to scale changes and learns scale adaptation, thereby reducing missed and false detections of people at all scales. Third, from the application perspective, in order to generate high-quality, high-resolution crowd density maps that accurately reflect the crowd distribution, we not only use learnable deconvolution layers in the network to help restore resolution, but also introduce the structure of a generative adversarial network, which compensates for the blurring and averaging effects caused by the traditional loss function and improves the quality of the crowd density map.

The crowd counting algorithm proposed in this thesis is evaluated on four public crowd counting datasets: ShanghaiTech, WorldExpo'10, UCSD, and UCF_CC_50. The experimental results show that the proposed method effectively reduces the crowd counting error and matches or even exceeds the best internationally reported crowd counting accuracy.
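The cross-scale consistency constraint described in the abstract can be illustrated with a small sketch. The abstract only states that the residual between density values predicted at two scales is minimized. The sketch assumes one concrete arrangement that is not confirmed by the source: a density map predicted for a whole patch is compared with the four maps predicted for its quadrants, stitched back together, using a mean squared residual. The function names `count_from_density`, `stitch_quadrants`, and `cross_scale_residual` are hypothetical and chosen for illustration only.

```python
from typing import List

# A density map is a 2D grid of non-negative values (plain nested lists here).
DensityMap = List[List[float]]


def count_from_density(density: DensityMap) -> float:
    """Estimated crowd count: the sum of all values in the density map."""
    return sum(sum(row) for row in density)


def stitch_quadrants(tl: DensityMap, tr: DensityMap,
                     bl: DensityMap, br: DensityMap) -> DensityMap:
    """Join four quadrant maps (top-left, top-right, bottom-left,
    bottom-right) into one map covering the whole patch."""
    top = [left + right for left, right in zip(tl, tr)]
    bottom = [left + right for left, right in zip(bl, br)]
    return top + bottom


def cross_scale_residual(parent: DensityMap, stitched: DensityMap) -> float:
    """Mean squared difference between two maps of the same size.
    ASSUMPTION: the abstract does not specify the exact residual form."""
    total = 0.0
    n = 0
    for parent_row, stitched_row in zip(parent, stitched):
        for p, s in zip(parent_row, stitched_row):
            total += (p - s) ** 2
            n += 1
    return total / n


if __name__ == "__main__":
    # Toy values standing in for network predictions at two scales.
    parent = [[0.5, 0.25],
              [0.0, 0.25]]
    stitched = stitch_quadrants([[0.5]], [[0.25]], [[0.0]], [[0.5]])
    print("count (whole patch):", count_from_density(parent))
    print("count (quadrants):  ", count_from_density(stitched))
    print("residual:           ", cross_scale_residual(parent, stitched))
```

In a training setup, a residual of this kind would be added to the other loss terms so that the two scales are pushed toward agreeing on both the local density values and the resulting count.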
Keywords/Search Tags: scale aggregation feature, cross-scale consistency, generative adversarial based, deep learning, crowd counting