Crowd Counting By Deep Learning

Posted on: 2021-02-26
Degree: Master
Type: Thesis
Country: China
Candidate: X Y Chen
Full Text: PDF
GTID: 2518306107460434
Subject: Control Science and Engineering
Abstract/Summary:

Crowd counting is an important research direction in computer vision and has been widely used in video surveillance, public safety, traffic monitoring and other fields. Commonly used methods adopt a fully convolutional neural network to learn a mapping from the original image to a density map, and the count is obtained by integrating the density map. This thesis studies crowd counting algorithms that use a fully convolutional network to regress the density map, and proposes the following improvements to address the problems of existing methods.

Scale variation is a major difficulty in crowd counting. Most state-of-the-art approaches tackle the multi-scale problem by adopting multi-column CNN architectures. However, such structures incur a huge resource cost: adopting multiple deep columns is infeasible, even though deep networks have been shown to perform well. Therefore, this thesis proposes a shared single deep column structure that extracts multi-scale features in the high layers. To extract multi-scale features, we propose the Scale Pyramid Module, which employs dilated convolutions with different rates in parallel, instead of traditional convolutions with different kernel sizes, to reduce the number of parameters. Experiments show that the proposed single-column structure, which extracts multi-scale features from high layers, yields more accurate estimates with fewer parameters than the multi-column structure, and that the proposed Scale Pyramid Module improves robustness to scale variation.

Existing density map-based methods focus excessively on localizing individuals. However, in highly crowded scenes each head occupies only a few pixels, so it is unreasonable to force the network to locate individuals accurately. In response to this problem, this thesis proposes a novel labeling scheme, termed Count Map. Each pixel in the Count Map represents the number of heads falling into the corresponding local area of the input image; this discards detailed spatial information and forces the network to pay more attention to counting than to localizing individuals. By finding a balance between counting and localizing, the Count Map achieves better results. Further, a joint optimization method based on regression and classification is proposed for the Count Map. Two branches are appended at the back end of the network: one regresses the local count map, and the other classifies each area, where the category is the count of people. Experiments show that the proposed Count Map achieves better estimation than the density map, and that the joint optimization method based on regression and classification accelerates convergence and obtains better crowd counting estimates.

Existing crowd counting methods use fully convolutional networks for density estimation. Because of the local receptive field of the convolution operation, a fully convolutional network has an inherent limitation in modeling relationships between global regions. However, regions of different densities are correlated, and this correlation could be used to further improve crowd counting performance. Inspired by the Graph Convolutional Network (GCN), a Region Relation-Aware Module is proposed to capture and exploit these region dependencies. Specifically, our model builds a fully connected directed graph between regions of different density, which are softly divided by a spatial attention mechanism. Each node (region) is represented by a feature vector initially generated by pooling the feature maps according to the region division. Then, a GCN is learned to map this region graph to a set of relation-aware region representations. These region representations are remapped to the input feature space and fused with the input feature maps for more accurate prediction. Experiments show that the proposed Region Relation-Aware Module significantly improves estimation accuracy.
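
The parameter argument behind the Scale Pyramid Module can be illustrated with a toy example. The sketch below is not the thesis's implementation: it uses a 1-D signal instead of 2-D feature maps, and the function names, the three-tap kernel and the rates (1, 2, 3) are assumptions chosen for illustration. It shows only that parallel branches with the same number of weights cover progressively larger receptive fields as the dilation rate grows.

```python
from typing import Dict, List


def dilated_conv1d(signal: List[float], kernel: List[float], rate: int) -> List[float]:
    """'Valid' 1-D cross-correlation whose kernel taps are spaced `rate` apart."""
    span = (len(kernel) - 1) * rate + 1  # input positions covered by one output
    out = []
    for start in range(len(signal) - span + 1):
        out.append(sum(w * signal[start + k * rate] for k, w in enumerate(kernel)))
    return out


def receptive_field(kernel_size: int, rate: int) -> int:
    """Receptive field of a single dilated convolution layer."""
    return (kernel_size - 1) * rate + 1


signal = [float(i) for i in range(10)]  # toy input: 0.0 .. 9.0
kernel = [1.0, 1.0, 1.0]                # every branch has exactly three weights
rates = (1, 2, 3)                       # hypothetical dilation rates

# Parallel branches: same parameter count, different receptive fields.
branches: Dict[int, List[float]] = {r: dilated_conv1d(signal, kernel, r) for r in rates}
fields: Dict[int, int] = {r: receptive_field(len(kernel), r) for r in rates}
```

Here each branch keeps three weights while its receptive field grows from 3 to 5 to 7 input positions. An ordinary convolution would need a 5-tap or 7-tap kernel to cover the same span.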
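
The Count Map labeling can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the thesis's exact procedure: it assumes point annotations of head positions and non-overlapping square local areas, and the names `count_map`, `cell` and `heads` are hypothetical. Each entry of the resulting grid is the number of heads in one local area, so the entries sum to the total count while the exact positions inside each area are discarded.

```python
from typing import List, Tuple


def count_map(points: List[Tuple[float, float]], height: int, width: int,
              cell: int) -> List[List[int]]:
    """Count annotated heads in each cell x cell block of the image."""
    rows = (height + cell - 1) // cell  # ceiling division keeps partial edge blocks
    cols = (width + cell - 1) // cell
    grid = [[0] * cols for _ in range(rows)]
    for x, y in points:
        # Ignore annotations that fall outside the image.
        if 0 <= x < width and 0 <= y < height:
            grid[int(y) // cell][int(x) // cell] += 1
    return grid


# Hypothetical (x, y) head annotations in a 16 x 16 image.
heads = [(1.0, 2.0), (3.5, 3.0), (9.0, 1.0), (15.0, 15.0), (8.0, 12.5)]
cmap = count_map(heads, height=16, width=16, cell=8)

# Summing the map recovers the total count.
total = sum(sum(row) for row in cmap)
```

In this setting the regression branch described in the abstract would be trained against the values in `cmap`, and the classification branch against the same values treated as class labels. How the thesis bounds or bins those classes is not stated in the abstract.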
Keywords/Search Tags: Computer vision, Crowd counting, Deep learning, Graph Convolutional Network