Research On Crowd Counting Algorithm For Complex Scenes With Convolutional Neural Network

Posted on: 2024-09-03 | Degree: Master | Type: Thesis
Country: China | Candidate: D X Yin
GTID: 2568307094481224 | Subject: Control Science and Engineering
Abstract/Summary:
As one of the key technologies of intelligent surveillance systems, crowd counting aims to accurately estimate the number and density distribution of people in real scenes, and plays an important role in public security warning, traffic control, and urban planning. As transportation becomes more convenient and urbanization advances, large-scale population flows and crowd gatherings are becoming more frequent, increasing the probability of safety accidents in crowded places. Crowd counting enables real-time monitoring of crowd conditions in complex scenes, so that monitoring information can be analyzed and safety hazards identified. In recent years, thanks to advances in image processing and hardware, crowd counting methods based on convolutional neural networks have attracted considerable research attention for their counting accuracy and generalization. However, in complex scenes, existing crowd counting algorithms are still challenged by non-uniform crowd distribution, background noise interference, and large scale variations. To overcome these challenges, this thesis investigates attention mechanisms, multi-scale perception, and network structure, and proposes convolutional neural network models that improve the accuracy and robustness of crowd counting in complex scenes. The main contents are as follows:

(1) A context-aware crowd counting algorithm is proposed for the problem of non-uniform crowd distribution in complex scenes. First, to better extract head features in dense crowds, an atrous spatial pyramid pooling (ASPP) module over cross-layer features is constructed. It combines atrous convolutions with different dilation rates and cross-layer channel concatenation to extract head features of different sizes while retaining the spatial detail of shallow-layer features. Second, to fully fuse features from different layers, a feature pyramid fusion module is constructed. It fuses deep-layer contextual information into the shallow layers through channel concatenation to suppress noise in crowded areas, and retains the boundary detail of the shallow features through pixel-wise summation. Finally, a contextual attention module extracts local and global context information, then generates and fuses local and global attention maps that can cope with different crowd distributions, so that the network focuses on head information and reduces the counting errors caused by varying crowd distributions. To verify the effectiveness of each module and to determine the optimal module parameters, comparison experiments are designed against alternative methods and across different parameter settings within the proposed modules. To verify the counting performance of the model in complex scenes, comparison experiments with advanced crowd counting algorithms are conducted on the ShanghaiTech, UCF_CC_50, UCF-QNRF, and NWPU-Crowd datasets, reaching MAE values of 61.7 and 7.8 and MSE values of 99.6 and 12.8 on the ShanghaiTech dataset. The results show that the proposed algorithm achieves high accuracy and robustness in scenes with non-uniform crowd distribution.

(2) To address background noise interference and large scale variation in complex scenes, a multi-scale fusion crowd counting algorithm based on the attention mechanism is proposed. First, an atrous spatial pyramid pooling module based on residual connections is constructed. It captures multi-scale head features through multiple atrous convolutions with different dilation rates and retains the spatial detail of the original feature map through the residual structure. Second, a multi-branch feature fusion structure is constructed to fuse branches with different receptive field sizes and enrich the feature information. Then, a channel and spatial attention module is proposed to strengthen the network's ability to discriminate between background and head targets: it models the long-range dependencies of the channel and spatial information of the feature map through pixel-by-pixel matrix operations and adaptively corrects the location information. Next, comparison experiments are designed to verify the effectiveness of each module, comparing the residual ASPP module, the multi-branch feature fusion structure, and the channel and spatial attention module against other methods and across different module parameters. Finally, experimental results on the ShanghaiTech and UCF_CC_50 crowd datasets show that the MAE and root MSE of the proposed algorithm are 1.4%, 4.2% and 4.9%, 1.8% lower, respectively, than the best values among eleven advanced comparison algorithms. In scenes with background noise interference and large scale changes, the proposed algorithm predicts the crowd distribution more accurately and generates high-quality crowd density maps.
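The evaluation metrics named in the abstract can be illustrated with a minimal Python sketch. It assumes the usual density-map setup, in which the predicted count for an image is the sum of all values in its predicted density map. It also assumes that "MSE" follows the common crowd counting convention of reporting the root of the mean squared count error; the abstract does not state the exact definition used in the thesis. The function names and sample numbers are hypothetical and are not taken from the thesis.

```python
import math

def count_from_density(density_map):
    """Estimated head count: sum of all pixel values of a 2-D density map."""
    return sum(sum(row) for row in density_map)

def mae(predicted, truth):
    """Mean absolute error between predicted and ground-truth counts."""
    if not predicted or len(predicted) != len(truth):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - t) for p, t in zip(predicted, truth)) / len(predicted)

def rmse(predicted, truth):
    """Root of the mean squared count error (assumed meaning of "MSE" here)."""
    if not predicted or len(predicted) != len(truth):
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((p - t) ** 2 for p, t in zip(predicted, truth))
    return math.sqrt(squared / len(predicted))

# Hypothetical per-image counts, for illustration only.
predicted_counts = [10.0, 20.0, 30.0]
true_counts = [12.0, 18.0, 33.0]
print(mae(predicted_counts, true_counts))
print(rmse(predicted_counts, true_counts))
```

Lower values are better for both metrics. MAE reflects average counting accuracy, while the squared term in the second metric penalizes large errors on individual images, which is why it is often read as an indicator of robustness.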
Keywords/Search Tags:Crowd counting, Convolutional neural networks, Attention mechanism, Multi-scale fusion, Contextual information awareness, Crowd density map