Human Motion Behavior Recognition Method Based On Multi-stream Residual Neural Network

Posted on: 2021-04-08
Degree: Master
Type: Thesis
Country: China
Candidate: Z Y Li
Full Text: PDF
GTID: 2428330647461932
Subject: Computer Science and Technology
Abstract/Summary:
With the advent of the era of artificial intelligence, video-based human motion recognition has shown broad application value in areas such as intelligent video surveillance, virtual reality, and video retrieval. An analysis of related research on human motion behavior recognition shows that existing methods struggle with interference such as complex backgrounds, lighting changes, and scene changes in the video, and that they cannot make effective use of the temporal dependencies between video frames, which degrades the final recognition result. To address these issues, this thesis proposes a spatiotemporal saliency image that characterizes the foreground information of the video, and builds a multi-stream residual neural network model to obtain the spatiotemporal and saliency features of the video. The network model is further improved to capture the key information in the video sequence. The main research work is as follows:

(1) A spatiotemporal saliency image that effectively represents video foreground information is generated. The spatiotemporal saliency image is segmented from the motion region of the foreground target in the video and does not contain complex background information. Using the generalized boundary detection method and K-means clustering, with the video RGB images and optical flow images as input, a spatiotemporal edge probability map is obtained, and the probability map distance of the foreground target in the video is calculated.

(2) A multi-stream residual neural network model based on spatiotemporal saliency is proposed. The network model is pre-trained to obtain its initial weight parameters. With video RGB images, optical flow images, and spatiotemporal saliency images as inputs, the multi-stream residual neural network is then fine-tuned, and the class scores from the softmax layers of the three streams are averaged to obtain the final recognition result. Compared with the two-stream convolutional neural network, the multi-stream residual neural network makes effective use of the spatiotemporal saliency information in the video and improves recognition accuracy on the UCF101 and HMDB51 datasets by 2.9% and 1.9%, respectively.

(3) A behavior recognition model that combines the spatiotemporal saliency multi-stream network with an attention mechanism network is proposed. The multi-stream residual neural network extracts features from the three modalities of video data, and the extracted feature vectors are fed into the attention mechanism network, which selects, from the large amount of spatiotemporal information, the information most critical to video recognition for further learning. Consecutive video frames are temporally dependent, and the outputs of the individual streams are fused to obtain the final recognition result. Trained end to end, this action recognition model achieves classification accuracies of 92.7% and 64.4% on the UCF101 and HMDB51 datasets, respectively.
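
The score-level fusion described in item (2), where the softmax class scores of the three streams are averaged, can be illustrated with a minimal sketch. This is not the thesis implementation: the function names `softmax` and `fuse_streams`, the three-class setup, and the logit values are all hypothetical and chosen only for illustration, and the streams are assumed to be weighted equally.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of raw class scores."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_streams(stream_logits):
    """Average the per-stream softmax scores (equal weights assumed).

    Returns the index of the highest fused score and the fused scores.
    """
    probs = [softmax(logits) for logits in stream_logits]
    n_classes = len(probs[0])
    fused = [sum(p[c] for p in probs) / len(probs) for c in range(n_classes)]
    predicted = max(range(n_classes), key=fused.__getitem__)
    return predicted, fused


# Hypothetical raw outputs of the three streams for a 3-class problem.
rgb_logits = [2.0, 1.0, 0.1]       # RGB stream
flow_logits = [0.5, 2.5, 0.3]      # optical flow stream
saliency_logits = [0.2, 2.0, 0.1]  # spatiotemporal saliency stream

predicted, fused = fuse_streams([rgb_logits, flow_logits, saliency_logits])
print("predicted class:", predicted)
print("fused scores:", [round(s, 3) for s in fused])
```

In this made-up example the RGB stream alone favors class 0, while the averaged scores favor class 1, which shows how the additional streams can change the final decision.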
Keywords/Search Tags: human motion recognition, spatiotemporal saliency, multi-stream convolutional neural network, attention mechanism