| With rapid economic and social development, large numbers of people are migrating from the countryside to cities. At the same time, surveillance video devices are becoming more common in public places, so obtaining a timely count of the people in a given scene from surveillance video has become particularly important. Crowd counting is the task of estimating the number of people in a given scene, and it has a wide range of applications in public safety, urban resource estimation, business behavior analysis, and traffic planning and scheduling. Counting the passenger flow of rail transit can help the traffic management department make decisions, dynamically adjust vehicle operations and timetables, and improve the operating efficiency of traffic resources. Lightweight network models are an important research direction in deep learning. Unlike earlier models that pursued accuracy above all else, lightweight models seek a balance between accuracy and operational efficiency. Although crowd counting research has made great progress in recent years, it still faces challenges, mainly complex background interference, scale changes, and occlusion of crowd targets. To address these problems, the research work of this thesis consists of the following three parts. First, based on a field investigation of Chongqing Rail Transit Line 6, a rail transit passenger counting dataset, CQRailway Station, was constructed. Second, to handle interference from complex backgrounds, this thesis proposes a lightweight crowd counting method fused with guided filtering. The input image is first denoised by guided filtering, which reduces background interference while preserving clear target edges. The lightweight MobileNet is then used for feature extraction. Compared with existing mainstream methods that use VGG as the backbone network, the proposed method reduces the size and parameter count of the model and greatly improves training speed, while maintaining the model's accuracy well. An improved dilated convolution is used to enlarge the receptive field of the neurons, obtain richer semantic feature information, and reduce the loss of feature-map information caused by the gridding effect; the network finally outputs the crowd density map and the counting result. Third, to address the scale variation of crowd targets in an image, in which nearby targets appear large and distant targets appear small (“near big and far small”), this thesis proposes a lightweight crowd counting method based on multiscale feature fusion. By reusing low-level features, the method fuses low-level detail features with high-level semantic features. The fused features enable the network to cope with human head targets of different scales and effectively address the problem of target scale changes. |
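
The guided-filtering denoising step mentioned in the abstract can be illustrated with a minimal NumPy sketch. It assumes the standard box-filter formulation of the guided filter, applied to a grayscale image that serves as its own guide. The abstract does not state the guide image, window radius, or regularization strength, so `r`, `eps`, and the helper names `box_mean` and `guided_filter` are illustrative assumptions and not the thesis's implementation.

```python
import numpy as np


def box_mean(img, r):
    """Mean over a (2r+1) x (2r+1) window, replicating edge pixels at borders."""
    k = 2 * r + 1
    padded = np.pad(img, r, mode="edge")
    # Integral image with a leading row and column of zeros.
    ii = np.zeros((padded.shape[0] + 1, padded.shape[1] + 1))
    ii[1:, 1:] = padded.cumsum(axis=0).cumsum(axis=1)
    h, w = img.shape
    total = (ii[k:k + h, k:k + w] - ii[:h, k:k + w]
             - ii[k:k + h, :w] + ii[:h, :w])
    return total / (k * k)


def guided_filter(guide, src, r=4, eps=1e-2):
    """Edge-preserving smoothing of `src` steered by `guide` (2-D arrays).

    r and eps are assumed values chosen for illustration only.
    """
    guide = np.asarray(guide, dtype=np.float64)
    src = np.asarray(src, dtype=np.float64)
    mean_i = box_mean(guide, r)
    mean_p = box_mean(src, r)
    var_i = box_mean(guide * guide, r) - mean_i ** 2
    cov_ip = box_mean(guide * src, r) - mean_i * mean_p
    # Local linear model q = a * guide + b fitted in each window.
    a = cov_ip / (var_i + eps)
    b = mean_p - a * mean_i
    # Average the coefficients of all windows covering each pixel.
    return box_mean(a, r) * guide + box_mean(b, r)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    clean = np.zeros((32, 64))
    clean[:, 32:] = 1.0                      # vertical step edge
    noisy = clean + 0.05 * rng.standard_normal(clean.shape)
    smooth = guided_filter(noisy, noisy)     # self-guided denoising
    print("flat-region std before:", noisy[:, :20].std())
    print("flat-region std after: ", smooth[:, :20].std())
```

In this formulation a larger `eps` smooths more aggressively, while a smaller `eps` keeps more of the guide's edges; in flat regions the local variance is small relative to `eps`, so the output approaches the local mean, whereas across a strong edge the variance dominates and the edge is largely retained.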