As an important method of crowd analysis, crowd counting has great potential value in public security, epidemic prevention, agricultural measurement and other fields. With the advent of deep learning and attention models, crowd counting algorithms have developed rapidly. However, when counting dense crowds, traditional algorithms still face the challenges of non-uniform distribution, scale variation and complex backgrounds. This paper rethinks the feature learning of deep neural networks for crowd images and proposes a basic backbone network model that effectively alleviates the problems caused by scale variation and background ambiguity in crowd images. From the two aspects of interactive enhancement and adaptive fusion, it further optimizes the model structure by introducing attention mechanisms, making full use of the feature information in the crowd counting network. The main research work of this paper is as follows:

(1) Research on a crowd counting method based on multi-scale learning and background denoising. To address the common problems of scale variation and background ambiguity in crowd images, this paper proposes a basic crowd counting model, SAENet. It uses Res2Net, which is based on multi-scale learning, as the front-end network instead of the traditional VGG network, so that the network learns richer scale feature representations from the source. To retain more detailed information, the back-end network borrows the idea of U-Net, using shortcut connections and concatenation to decode the feature maps from the front-end network. The back end consists of two branches, which yields a new multi-task model. One branch is DENet, which generates crowd density maps; the other is AENet, which introduces a mask-based attention mechanism to generate attention maps that help DENet focus on crowd targets and avoid the interference of background ambiguity in crowd counting. To further explore
the validity of SAENet, this paper carries out experiments on the depth, scale and structure of the network to verify the correctness of the SAENet design.

(2) Research on introducing an attention mechanism to locally enhance the features of SAENet. To address the lack of interaction when SAENet learns multi-scale features and multi-task features, this paper proposes an attention-based IA-block module that computes the correlation information between the two branches using the feature information from each decoder layer. The module builds a spatial attention network based on the Non-local model and a channel attention network based on the SE-block, enhancing the feature information in SAENet along both the spatial and channel dimensions. The effectiveness of the IA-block module is confirmed through ablation experiments.

(3) Research on introducing an adaptive feature map fusion strategy to conduct global fusion of SAENet's multi-scale hierarchical feature maps. To address the insufficient use of feature information in different layers, this paper proposes a global scale adaptive fusion strategy, GSAFS. Because the correlation between feature maps from adjacent layers in SAENet can make feature fusion ambiguous, GSAFS constructs three auxiliary networks (R1, R2 and R3): the feature maps in DENet are extracted by convolution and upsampling layers, and a learnable parameter is used to achieve global multi-scale adaptive fusion between the density map and the attention map. IA-block and GSAFS complement each other in SAENet and work together to promote the efficient use of feature information in the backbone network. The effectiveness of GSAFS is also verified by ablation experiments.
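
The two ideas that recur above, an attention map that suppresses background in a density map and a learnable weighting that fuses maps from several scales, can be illustrated with a small sketch. It is a toy illustration under stated assumptions, not the SAENet implementation: the abstract does not say how the attention map is applied or how the learnable parameter enters the fusion, so the element-wise product, the softmax weighting, and the function names `apply_attention`, `crowd_count` and `fuse` are all hypothetical choices made for this example. Maps are plain nested lists so that the sketch needs no external libraries.

```python
import math

def apply_attention(density, attention):
    """Gate a density map with an attention map (assumed: element-wise product).

    Attention values near 0 suppress background pixels; values near 1 keep
    crowd pixels unchanged.
    """
    return [[d * a for d, a in zip(d_row, a_row)]
            for d_row, a_row in zip(density, attention)]

def crowd_count(density):
    """The estimated count is the sum over all pixels of the density map."""
    return sum(sum(row) for row in density)

def softmax(logits):
    """Turn unconstrained parameters into positive weights that sum to 1."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def fuse(maps, logits):
    """Weighted sum of same-sized maps (assumed fusion rule).

    `logits` stands in for the learnable parameters; in a real network they
    would be updated by gradient descent rather than set by hand.
    """
    weights = softmax(logits)
    rows, cols = len(maps[0]), len(maps[0][0])
    return [[sum(w * m[i][j] for w, m in zip(weights, maps))
             for j in range(cols)]
            for i in range(rows)]

# Toy 2x3 example: the right-hand column is background clutter.
density = [[0.5, 0.5, 0.2],
           [1.0, 0.0, 0.3]]
attention = [[1.0, 1.0, 0.0],
             [1.0, 1.0, 0.0]]

refined = apply_attention(density, attention)
print("raw count:    ", crowd_count(density))
print("refined count:", crowd_count(refined))

# Equal logits give equal weights, so the fused map is the element-wise mean.
fused = fuse([density, refined], [0.0, 0.0])
print("fused count:  ", crowd_count(fused))
```

In this toy case the attention map zeroes out the background column, so the refined count is lower than the raw count, and the fused count lies between the two.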