| Crowd counting has many practical applications. For example, controlling the number of people in public places can effectively prevent safety accidents such as overcrowding and stampedes. Accurately locating individuals in a crowd and estimating their scales are important foundations for tracking individuals and analyzing complex crowd behavior. Researchers have therefore studied crowd counting extensively. Existing crowd counting methods are mainly trained and tested in the same or similar scenes. However, when the training and testing scenes are of different types, counting performance drops sharply. In practice, captured crowd scenes vary constantly, and existing methods adapt poorly to new scenes, which limits their use in practical applications. Improving the cross-scene capability of crowd counting methods faces many challenges. For example, different types of crowd scenes exhibit varying degrees of perspective distortion (i.e., individuals closer to the camera appear larger, while individuals farther from the camera appear smaller). In addition, when the crowd scene changes, the background becomes uncertain and interferes with counting accuracy. At present, most head scale estimation methods based on individual localization estimate the head scale from the distances between neighboring individuals. However, in scenes where the crowd is non-uniformly distributed, such methods cannot produce ideal head scale estimates. To address the weak cross-scene capability of existing crowd counting methods and the fact that individual localization and scale estimation methods adapt only to specific scenes, this work conducted the following
research: (1) A cross-scene crowd counting method based on multi-scale feature fusion and cascaded supervision (MFFNet) is proposed. The network upsamples multi-scale features extracted from crowd images and combines them into multi-scale feature blocks. Convolution and deconvolution operations are then applied to the multi-scale feature blocks to obtain features at different resolutions, and a bottom-up structure fuses these features. During feature fusion, multi-scale context aggregation based on dilated convolution is used to predict the crowd density maps corresponding to the features at each resolution. For training, a cascaded supervision strategy is proposed that optimizes the features at different resolutions simultaneously, and a background suppression loss is introduced. Cross-scene crowd counting experiments on four types of scenes show that the proposed method has stronger scene adaptability than the current best method. This work has been published in IEEE Transactions on Instrumentation and Measurement. (2) A head scale estimation method that adapts to perspective distortion in crowd images is proposed. Building on the individual positions estimated by the proposed crowd counting method, this study further achieves head scale estimation that adapts to perspective distortion. The method first divides the crowd image into blocks according to distance, and then uses the crowd counting model MFFNet to predict the positions of individuals in each image block. The initial head scale is then estimated from the distances between each individual and its neighbors, and adjusted according to the principle that individuals at a similar distance from the camera have similar head scales. Because the slope of the perspective distortion in a crowd image is unknown and the distortion is not strictly linear, curve fitting is used to
estimate the head scales in the crowd image. Tests were conducted on the ShanghaiTech dataset. The experimental results show that the proposed perspective-adaptive head scale estimation method effectively improves the accuracy of head scale estimation in crowds. (3) A crowd counting system was designed and implemented, with functions including crowd density map prediction, individual localization, crowd counting, and head scale estimation. With this system, users can quickly obtain the crowd distribution and the total number of people, and estimate the distance between individuals. |
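
The head scale estimation steps in (2) can be illustrated with a minimal sketch. It is not the implementation described above: it assumes head positions are already available as (x, y) pixel coordinates, takes the initial scale as a fraction `beta` of the mean distance to the `k` nearest neighbors, and models perspective with a polynomial in the image row `y`. The function names, `k`, `beta`, and the polynomial degree are all hypothetical choices made for illustration.

```python
import numpy as np


def initial_head_scales(points, k=3, beta=0.3):
    """Initial scale per head: beta times the mean distance to the k nearest neighbors.

    k and beta are illustrative assumptions, not values from the source.
    """
    pts = np.asarray(points, dtype=float)
    if len(pts) < 2:
        raise ValueError("need at least two head positions")
    # Pairwise Euclidean distances between all head positions.
    diff = pts[:, None, :] - pts[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)  # exclude each point's distance to itself
    k = min(k, len(pts) - 1)
    nearest = np.sort(dist, axis=1)[:, :k]
    return beta * nearest.mean(axis=1)


def fit_scale_curve(points, scales, degree=2):
    """Fit head scale as a polynomial in the image row (y coordinate).

    A low-degree polynomial is an assumed stand-in for the curve fitting
    mentioned in the text; it allows for non-linear perspective distortion.
    """
    ys = np.asarray(points, dtype=float)[:, 1]
    return np.poly1d(np.polyfit(ys, scales, degree))


if __name__ == "__main__":
    # Synthetic heads: spacing between neighbors grows toward the image bottom.
    heads = [(10, 10), (14, 10), (18, 10),
             (10, 60), (22, 60), (34, 60),
             (10, 120), (30, 120), (50, 120)]
    scales = initial_head_scales(heads, k=2)
    curve = fit_scale_curve(heads, scales)
    print("fitted scale at row 90:", float(curve(90)))
```

Fitting against the row coordinate lets sparse regions, where neighbor distances overestimate head size, borrow information from heads at a similar depth, which is the motivation the text gives for adjusting the initial estimates.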