| With the rapid development of deep learning, the performance of vision tasks such as object detection, semantic segmentation, and super-resolution reconstruction has improved greatly. Crowd counting is another classical research direction in computer vision: given an input image, a model outputs the corresponding density map. It is widely used in video surveillance, congestion warning, and traffic prediction. However, generating high-quality crowd density maps remains challenging because of complex lighting conditions, severe occlusions, viewpoint distortions, and varying distributions of crowd density. Meanwhile, to cope with the complexity of realistic scenarios, current crowd counting models commonly rely on complex structures with a large number of parameters to ensure performance. The resulting computational cost and long inference time limit the practical deployment of these models. This paper addresses performance optimization and efficient network design for crowd counting. The main work and contributions of this paper are as follows. First, this paper proposes an attention-guided feature pyramid network that adaptively generates high-quality density maps with accurate spatial locations of the crowd. Relying on the feature pyramid structure, the approach incorporates low-level features, which are rich in spatial location information, into high-level features, which are rich in semantic information, so that the fused features are both semantically and spatially aware. A spatial attention module is also designed to adaptively emphasize crowd areas and suppress background clutter during forward feature extraction. In addition, the model dynamically learns channel-wise weighting coefficients that distinguish the importance of the corresponding channels of high-level and low-level features, thus achieving adaptive and learnable feature fusion. Experiments on four publicly available classical crowd counting datasets show that the proposed method outperforms other state-of-the-art methods. For example, on the high-density dataset UCF-QNRF, the proposed method improves the MSE metric by 9.1%. Second, this paper proposes a knowledge distillation scheme for the crowd counting task, whose goal is to reduce the number of parameters in the crowd counting model while preventing a collapse in performance. The scheme targets the difficulty of transferring knowledge effectively in crowd counting tasks. It combines knowledge distillation with structural reparameterization to provide a more comprehensive and targeted reparameterized expansion structure for current crowd counting models. Through the adaptive expansion of parameters, the student model can effectively close the capacity gap with the larger teacher model. A gradient decoupling module is also designed, which avoids gradient deviation during distillation and improves the efficiency with which the expanded parameters are used. The results show that the proposed method generates high-quality crowd density maps while speeding up inference by a factor of 4 to 10, and that it outperforms current mainstream model compression methods in terms of performance metrics. |
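
The structural reparameterization mentioned above can be illustrated with a minimal sketch. The abstract does not specify the paper's actual expansion structure, so this example assumes a generic, hypothetical block with two parallel linear branches plus an identity shortcut, and uses plain matrices in place of convolutions. All function and variable names are illustrative. The point is only that an expanded training-time block built from linear branches can be folded into a single set of weights for inference without changing its output.

```python
# Hypothetical illustration of structural reparameterization.
# Assumption: the block is y = W_a x + W_b x + x (two linear branches and an
# identity shortcut). This is NOT the paper's architecture, only a toy instance.

def matvec(w, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) for row in w]

def expanded_forward(w_a, w_b, x):
    """Training-time block: evaluate each branch separately, then sum."""
    y_a = matvec(w_a, x)
    y_b = matvec(w_b, x)
    return [a + b + x_i for a, b, x_i in zip(y_a, y_b, x)]

def merge_branches(w_a, w_b):
    """Fold both branches and the identity shortcut into one weight matrix."""
    n = len(w_a)
    return [
        [w_a[i][j] + w_b[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
        for i in range(n)
    ]

# Toy weights and input (arbitrary values chosen for the example).
w_a = [[0.5, -1.0], [2.0, 0.25]]
w_b = [[1.5, 0.0], [-1.0, 0.75]]
x = [3.0, -2.0]

w_merged = merge_branches(w_a, w_b)          # single matrix used at inference
y_train = expanded_forward(w_a, w_b, x)      # three-branch computation
y_infer = matvec(w_merged, x)                # one matrix-vector product

print(y_train, y_infer)
```

Both paths produce the same output, but the merged form needs one matrix-vector product instead of two plus a residual addition. For convolutional branches the same idea would apply to kernels, after padding smaller kernels to a common size.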