Research on Intelligent Analysis Technology of Railway Video Based on Multi-Scale Spatial and Temporal Characteristics

Posted on: 2022-09-21; Degree: Doctor; Type: Dissertation
Country: China; Candidate: Y Q Wan
GTID: 1482306560489304; Subject: Carrier Engineering
Abstract/Summary:
Intelligent analysis of railway surveillance video is an important means of ensuring the safe operation of high-speed railways. As passenger volume grows steadily year by year, abnormal behavior analysis and crowd density analysis of pedestrians have become key research topics for intelligent railway monitoring systems. In real transportation scenes, factors such as uneven illumination and overly complex backgrounds mean that video-based vision tasks have not yet been solved satisfactorily. To recognize pedestrian behavior and crowd density quickly and accurately in high-speed railway surveillance video, this dissertation proposes a fast optical flow estimation algorithm based on multi-scale spatial feature extraction, a human behavior recognition algorithm based on multi-scale spatiotemporal feature extraction, and a crowd density estimation algorithm based on spatiotemporal feature extraction.

To obtain short-term motion information quickly and accurately from an image sequence, a fast optical flow estimation algorithm based on a spatial pyramid and a deep convolutional network is proposed, which addresses the heavy computation and poor real-time performance of high-precision optical flow algorithms. To overcome the "large displacement" problem in optical flow estimation, the input to the deep convolutional network is a multi-scale representation of the image. The network learns the mapping between the images and the optical flow map, and the optical flow field is estimated level by level from the smallest image scale to the largest. The deep convolutional network consists of a feature extraction network and a context network: the feature extraction network learns the optical flow information of consecutive images, and the context network post-processes the estimated optical flow to compensate for the loss of spatial information caused by image rescaling. Experimental results show that, through the self-learning ability of convolution and the "coarse-to-fine" structure, the model strikes a good balance between the efficiency and the accuracy of optical flow estimation and improves accuracy on blurred images, which lays a good foundation for the subsequent railway video analysis tasks.

To achieve reliable behavior recognition in video surveillance scenes, this dissertation proposes a behavior recognition algorithm based on temporal multi-scale spatiotemporal feature extraction, which addresses both the diversity of scales along the time dimension in video recognition tasks and the limitations of a single convolutional network in video analysis. The proposed algorithm uses the fast optical flow estimation algorithm described above together with a 3D/2D hybrid network to build a two-stream model for multi-scale feature extraction from video. The model consists of a long-term spatiotemporal feature extraction network based on 3D convolution and a short-term spatiotemporal feature extraction network based on 2D convolution. By letting the features of the two sub-networks complement each other, the two-stream network compensates for the inability of 2D convolution to capture temporal information in video, fuses deep learning features with handcrafted features, and provides a multi-scale representation of video information, which improves the recognition accuracy of the model. The 3D/2D hybrid network mechanism also offers ideas for other video analysis techniques.

Building on the above work on the 3D/2D hybrid network mechanism, a crowd density estimation algorithm based on spatiotemporal feature extraction is proposed to address crowd counting in public scenes. The crowd density estimation model consists of an encoder network and a decoder network. The encoder is a 3D/2D hybrid network that extracts features from the crowd image sequence. The decoder is a deconvolution network that upsamples the feature map output by the encoder to generate a high-quality crowd density map. Experimental results show that introducing temporal features improves the model's ability to represent crowd features, suppresses the influence of background noise in the video on the counting algorithm, and reduces the crowd counting error.

Based on these results, this dissertation focuses on abnormal behavior recognition and crowd density estimation in railway scenes, providing solutions for the intelligent development of railway monitoring systems. To improve the generalization performance of the models in railway scenes, an abnormal behavior database for railway scenes and a platform crowd database are built from video collected by the existing high-speed railway monitoring system. Experimental results show that the proposed behavior recognition algorithm and crowd density estimation algorithm achieve good recognition accuracy in railway scenes.
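The coarse-to-fine idea behind the optical flow algorithm in the abstract can be illustrated with a minimal sketch. It is not the dissertation's method: the deep convolutional network is replaced here by a brute-force matching search, and the images are replaced by 1D signals, purely to show why estimating from the smallest scale to the largest helps with large displacements. All function names and parameters below are hypothetical.

```python
def downsample(signal):
    """Halve the resolution by averaging neighbouring pairs of samples."""
    return [(signal[i] + signal[i + 1]) / 2.0 for i in range(0, len(signal) - 1, 2)]


def build_pyramid(signal, levels):
    """Return the signal at each scale, ordered from finest to coarsest."""
    pyramid = [signal]
    for _ in range(levels - 1):
        pyramid.append(downsample(pyramid[-1]))
    return pyramid


def matching_cost(first, second, shift):
    """Mean squared difference between first[i] and second[i + shift]."""
    total, count = 0.0, 0
    for i in range(len(first)):
        j = i + shift
        if 0 <= j < len(second):
            total += (first[i] - second[j]) ** 2
            count += 1
    return total / count if count else float("inf")


def refine(first, second, initial_shift, radius=1):
    """Stand-in for the learned estimator: search a small window around the guess."""
    candidates = range(initial_shift - radius, initial_shift + radius + 1)
    return min(candidates, key=lambda s: matching_cost(first, second, s))


def coarse_to_fine_shift(first, second, levels=4, radius=1):
    """Estimate the displacement from the coarsest scale up to the finest."""
    pyr_a = build_pyramid(first, levels)
    pyr_b = build_pyramid(second, levels)
    shift = 0
    for level in range(levels - 1, -1, -1):
        shift = refine(pyr_a[level], pyr_b[level], shift, radius)
        if level > 0:
            shift *= 2  # a displacement doubles when the resolution doubles
    return shift


# Toy data: a triangular bump, then the same bump moved 8 samples to the right.
first = [0.0] * 64
for i in range(16, 32):
    first[i] = float(min(i - 15, 32 - i))
second = [0.0] * 8 + first[:-8]

print(coarse_to_fine_shift(first, second))  # prints 8
```

A search of radius 1 at full resolution alone can only return -1, 0, or 1, so it cannot recover a displacement of 8 samples. At the coarsest of the four levels the same motion is a single sample, which is within reach of the small search window, and each finer level only has to correct the upsampled estimate.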
Keywords/Search Tags: convolutional neural network, multi-scale spatiotemporal features, optical flow, action recognition, crowd density estimation, railway intelligent monitoring system