| The construction of urban security systems has driven a rapid increase in video surveillance equipment, and the volume of video image data has grown explosively. Crowd analysis based on complex scene images provides important reference information for building security protection systems through crowd status detection and early warning. As an important task in crowd analysis, crowd localization largely determines the performance of high-level tasks such as semantic analysis of scene images and key object recognition. Crowd localization treats each individual as a processing unit and aims to obtain the location of every independent individual target, which is valuable for analyzing changes in crowd state and maintaining public safety. In dense crowd scenes, neural-network-based crowd localization models are easily affected by feature noise caused by mutual occlusion between objects and by background interference. Owing to the dependence between objects, target features and background features become coupled, which blurs the boundaries of individual targets; moreover, targets in dense crowd scenes are surrounded by background objects whose features interfere with the target features, resulting in weak target feature responses. To address these problems, this paper studies crowd localization methods based on feature denoising. The main contributions are summarized as follows: (1) For the feature decoupling problem in the category space, the semantic information of the target area is used to guide a branch network to learn category-aware feature denoising weights. This promotes the decoupling of target-class and background-class features in the category space and guides the backbone network to pay differentiated attention to features of different categories, which enhances the feature response of independent individual targets in the spatial domain. A crowd localization method based on category-aware feature denoising is proposed. Semantic segmentation is used to obtain the spatial semantic information of the target, dilated convolution layers are used to enlarge the feature receptive field, and a category-aware feature denoising network is designed to learn category-specific feature denoising weights. Individual target prediction is transformed into a binary image prediction problem, and multi-scale denoised features guide an adaptive threshold network to learn to differentiate target-class features from background-class features, achieving accurate prediction of independent individual targets. (2) For the feature decoupling problem in the Fourier space domain, the random Fourier transform is used to map globally sampled image features to the Fourier transform domain, where the correlation between features is decoupled. This guides the model to focus on the true relationship between target features and labels and to suppress feature noise. A crowd localization method based on Fourier feature correlation is proposed. From the perspective of Fourier feature association between samples, a sample attention module embeds the same feature of each sample into the same Fourier feature spectrum, and sample attention weights matched to the target prediction error are learned through feature decorrelation optimization based on the cross-covariance operator. The attention-weighted loss guides the network to strengthen its differentiated learning of target features and background features, achieving accurate prediction of independent individual targets. |
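
The Fourier-domain decorrelation idea in contribution (2) can be illustrated with a small NumPy sketch. This is not the paper's network: it assumes the common random Fourier feature map `sqrt(2) * cos(w * x + phi)` with `w` drawn from a standard normal and `phi` from a uniform distribution, and it measures dependence between feature columns by the squared Frobenius norm of a sample-weighted cross-covariance matrix. The function names, the number of Fourier features, and the use of plain per-sample weights in place of a learned sample attention module are all hypothetical choices made for illustration.

```python
import numpy as np


def random_fourier_features(x, num_features, rng):
    """Map a feature column x of shape (n,) to shape (n, num_features).

    Assumed map: sqrt(2) * cos(w * x + phi), w ~ N(0, 1), phi ~ U(0, 2*pi).
    """
    w = rng.standard_normal(num_features)
    phi = rng.uniform(0.0, 2.0 * np.pi, num_features)
    return np.sqrt(2.0) * np.cos(np.outer(x, w) + phi)


def weighted_cross_covariance(u, v, weights):
    """Sample-weighted cross-covariance between u (n, p) and v (n, q)."""
    w = weights / weights.sum()   # normalise weights to sum to 1
    u_centered = u - w @ u        # subtract the weighted column means
    v_centered = v - w @ v
    return (u_centered * w[:, None]).T @ v_centered


def decorrelation_loss(features, weights, num_features=5, seed=0):
    """Sum of squared cross-covariances over all pairs of feature columns."""
    rng = np.random.default_rng(seed)
    n, d = features.shape
    mapped = [
        random_fourier_features(features[:, j], num_features, rng)
        for j in range(d)
    ]
    loss = 0.0
    for i in range(d):
        for j in range(i + 1, d):
            cov = weighted_cross_covariance(mapped[i], mapped[j], weights)
            loss += float(np.sum(cov ** 2))
    return loss


if __name__ == "__main__":
    rng = np.random.default_rng(42)
    n = 4000
    a = rng.standard_normal(n)
    b = rng.standard_normal(n)
    uniform = np.ones(n)

    independent = np.stack([a, b], axis=1)  # unrelated feature columns
    dependent = np.stack([a, a], axis=1)    # second column copies the first

    print("independent:", decorrelation_loss(independent, uniform))
    print("dependent:  ", decorrelation_loss(dependent, uniform))
```

In a training setup along the lines the abstract describes, the per-sample weights would presumably be optimised to reduce such a loss and then reused to weight the prediction loss; the sketch only evaluates the loss for fixed weights.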