
Research On Crowd Counting And Localization Method Based On Deep Learning

Posted on: 2024-03-20
Degree: Master
Type: Thesis
Country: China
Candidate: Z Z Ma
Full Text: PDF
GTID: 2568307151967019
Subject: Communication Engineering (including broadband network, mobile communication, etc.) (Professional Degree)
Abstract/Summary:
Crowd counting and localization is an important research topic in computer vision and bears on the future development of intelligent security systems. In the counting task, handling scale variation has become the focus of many methods. In the localization task, complex network frameworks and insufficiently accurate annotation information are the main obstacles to progress. This thesis therefore addresses these problems and aims to construct simple and efficient networks for crowd counting and localization.

First, to address scale variation, this thesis constructs a crowd counting network based on multi-scale and attention fusion, which aims to handle the variation in head scale in crowd images through a multi-scale architecture with an attention mechanism. The network uses a classical convolutional neural network to extract deep image features, and then applies the proposed multiple dilated (expansion) convolution module and multi-scale channel attention module to capture multi-scale crowd information and attend to it. Finally, the deeply fused features are fed into a regression network that outputs a density map whose pixel sum gives the crowd count. Experiments show that the network achieves good counting performance.

Second, building on this crowd counting network, this thesis proposes a joint-training method for crowd counting and localization. It aims to improve both tasks through the exchange of information between counting and localization. Specifically, the method shares the feature extraction and multi-scale fusion stages with the counting network and adds a localization branch at the back end of the network. The localization branch first highlights the spatial location information of the crowd through a spatial attention module, and then uses a regression network to upsample the features and output a heat map for localization. In the training phase, this heat map provides location information to the original density map regression network. In the testing phase, the output of the density map regression network guides peak point extraction from the heat map. Experimental results show that the network achieves accurate counting and localization.

Finally, to avoid inaccurate annotations and simplify training and testing, this thesis establishes a crowd localization and counting framework based on self-attention guidance. The framework uses point-supervised object detection to perform crowd localization and derives the count from the localization results. It trains the network directly on the original point annotations of the datasets, which avoids generating imprecise annotations and allows end-to-end training. To perceive context and cope with scale changes, a Transformer architecture is combined with a convolutional network to build the feature extractor, and a pyramidal feature fusion network is designed to integrate global and local multi-scale information. For the downstream tasks, the framework uses a regression head and a classification head to obtain the location and confidence of each predicted point, and uses the Hungarian algorithm to match the prediction set to the target set. Extensive experiments demonstrate that the framework achieves advanced performance in both counting and localization.
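
The testing-phase step described above, in which the density map output guides peak point extraction from the heat map, can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the thesis's implementation: the abstract does not say how the guidance works, so the sketch assumes the rounded pixel sum of the density map is used as the number of peaks to keep, and that a peak is a pixel strictly greater than all of its 8-connected neighbours. The function names `estimate_count`, `local_peaks`, and `localize` are hypothetical.

```python
from typing import List, Tuple

Grid = List[List[float]]


def estimate_count(density_map: Grid) -> int:
    """Crowd count taken as the rounded pixel sum of the density map."""
    return round(sum(sum(row) for row in density_map))


def local_peaks(heat_map: Grid) -> List[Tuple[float, int, int]]:
    """Return (score, row, col) for each pixel strictly greater than
    every neighbour in its 3x3 window (an assumed peak definition)."""
    h, w = len(heat_map), len(heat_map[0])
    peaks = []
    for r in range(h):
        for c in range(w):
            v = heat_map[r][c]
            neighbours = [
                heat_map[rr][cc]
                for rr in range(max(0, r - 1), min(h, r + 2))
                for cc in range(max(0, c - 1), min(w, c + 2))
                if (rr, cc) != (r, c)
            ]
            if all(v > n for n in neighbours):
                peaks.append((v, r, c))
    return peaks


def localize(density_map: Grid, heat_map: Grid) -> List[Tuple[int, int]]:
    """Keep the k strongest heat-map peaks, where k is the count
    estimated from the density map (assumed form of 'guidance')."""
    k = estimate_count(density_map)
    strongest = sorted(local_peaks(heat_map), reverse=True)[:k]
    return [(r, c) for _, r, c in strongest]
```

Under these assumptions, a weak spurious peak in the heat map is discarded whenever the density map indicates fewer people than there are candidate peaks, which is one plausible way the counting branch could help the localization branch at test time.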
Keywords/Search Tags: crowd counting, crowd localization, attention mechanism, scale change, object detection