| Crowd counting is a classic task in computer vision and is of great significance to public safety. Images and video scenes exhibit widely varying head scales and complex backgrounds, and current methods predict poorly under these conditions because they do not make full use of multi-scale features. In addition, distracting background objects that resemble people degrade counting performance. This paper studies these two problems separately using convolutional neural networks. The main research and innovation points are as follows: 1) To handle the complex information differences between backgrounds and crowds, the Aggregation Attention Scale-aware Network is designed. It introduces a new multi-scale progressive attention module, which uses an attention mechanism to learn the difference between head and background information in the context information, thereby improving counting accuracy. The innovation is to add multiple attention mechanisms to multi-scale sampling, which improves the expression of crowd features across regions of different scales and thus improves counting performance. 2) To handle widely varying head scales, the Context-aware Multi-scale Aggregation Network is designed to distinguish the multi-scale sampling information of the crowd through local and global context information. It introduces two new modules: the context-aware multi-scale aggregation module uses different sampling rates to obtain multi-scale features and adds a global receptive field branch to help the other multi-scale branches sample correctly; the context-aware module employs an attention mechanism that uses contextual information to identify crowd features in images. Extensive experiments on three challenging datasets (ShanghaiTech, UCF_CC_50, UCF-QNRF) achieve good results, which demonstrates the effectiveness of the proposed methods. |
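
The idea of sampling one feature map at several rates, adding a global branch, and fusing the branches with attention can be illustrated with a minimal pure-Python sketch. This is not the paper's implementation: the function names (`dilated_conv3x3`, `global_branch`, `aggregate`), the use of dilated 3x3 convolutions as the "different sampling rates", global average pooling as the "global receptive field branch", and a per-pixel softmax as the attention are all assumptions made for illustration. In the actual networks these components would be learned layers.

```python
import math

def dilated_conv3x3(feature, kernel, rate):
    """3x3 convolution with dilation `rate` and zero padding (same output size)."""
    height, width = len(feature), len(feature[0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            acc = 0.0
            for ky in range(3):
                for kx in range(3):
                    # A larger rate spreads the 3x3 taps over a wider area.
                    sy = y + (ky - 1) * rate
                    sx = x + (kx - 1) * rate
                    if 0 <= sy < height and 0 <= sx < width:
                        acc += kernel[ky][kx] * feature[sy][sx]
            out[y][x] = acc
    return out

def global_branch(feature):
    """Assumed global branch: average over the whole map, broadcast to every pixel."""
    height, width = len(feature), len(feature[0])
    mean = sum(sum(row) for row in feature) / (height * width)
    return [[mean] * width for _ in range(height)]

def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def aggregate(feature, kernel, rates=(1, 2, 3)):
    """Fuse multi-rate branches and a global branch with per-pixel attention.

    The attention here is a plain softmax over branch responses (an assumption);
    a trained model would predict these weights from context instead.
    """
    branches = [dilated_conv3x3(feature, kernel, r) for r in rates]
    branches.append(global_branch(feature))
    height, width = len(feature), len(feature[0])
    fused = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            responses = [branch[y][x] for branch in branches]
            weights = softmax(responses)
            fused[y][x] = sum(a * r for a, r in zip(weights, responses))
    return fused
```

Calling `aggregate(feature, kernel)` on a 2D list of floats returns a fused map of the same size. Small rates respond to nearby pixels (small heads), larger rates to wider neighbourhoods (large heads), and the global branch supplies scene-level context to the weighting.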