
Research On Target Behavior Recognition Combining Attention Mechanism And GRU Two-stream Network

Posted on: 2022-10-21
Degree: Master
Type: Thesis
Country: China
Candidate: S R Li
Full Text: PDF
GTID: 2518306515464204
Subject: Computer application technology
Abstract/Summary:
Video data can now be obtained in more and more ways, and its volume is growing ever faster, which makes its content increasingly complex. How to manage and use video data effectively has become an urgent problem, so behavior recognition tasks that take video as their source data have attracted wide interest from researchers. The task of behavior recognition is to analyze the behavior of a target in video data and assign the corresponding behavior label. Although behavior recognition has advanced considerably alongside artificial intelligence technology, progress has been slowed by factors such as occlusion, changes in viewing angle, and differences in how individuals perform the same behavior. This thesis studies behavior recognition from two angles: how to extract key feature information from video data, and how to make better use of the temporal information it contains. The main work is as follows:

(1) Building on the traditional Two-Stream convolutional neural network, a GRU-based Two-Stream network behavior recognition method is proposed (all GRU networks used in this thesis have 2 layers). The VGG-16 model replaces the VGG-M model of the traditional Two-Stream network in order to extract deeper feature information from the video data. The feature fusion stage uses convolutional fusion to obtain more expressive spatio-temporal features. A GRU unit is added to the traditional Two-Stream network to capture the global temporal feature sequence in the video data, compensating for the temporal stream network's limitation of extracting only local temporal features. To reduce training time and the risk of overfitting, VGG-16 models pre-trained on the ImageNet database are used as the two branch networks of the Two-Stream network. Experiments on the UCF-101 data set show that the GRU-based Two-Stream network method reaches a recognition rate of 92.3%, which the thesis reports is higher than that of other Two-Stream network models on UCF-101.

(2) A GRU Two-Stream network behavior recognition method based on attention mechanisms is proposed. Video data consists of a large number of consecutive frames. A single RGB frame captures only the state of a behavior at one moment, a run of consecutive RGB frames can represent one short stage of motion, and a complete behavior is usually composed of several such stages. To extract the effective feature information in video data, two attention mechanisms are introduced into the model to learn transformations of the RGB frame features and of the video-level features, respectively. The spatial attention mechanism locates the frame regions related to the target behavior while suppressing irrelevant information. Because the stages of a behavior differ in how much they help distinguish behavior types, the temporal attention mechanism learns a weight distribution over the different stages in time. The model comprises a spatial stream network that takes consecutive RGB images as input and a temporal stream network that takes stacked optical flow images as input. The attention modules assign attention weights to the spatial and temporal features, the two sets of features are combined by adaptive weighted fusion, and the fused features are fed into the GRU network to capture the global temporal feature sequence in the video data and finally produce the behavior category. Accuracy on the UCF-101 data set reaches 92.8%, indicating that introducing the attention mechanisms strengthens the model's ability to express key feature information and thereby improves recognition of the target behavior in video data.
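
As a rough illustration of the pipeline described in (2), the sketch below fuses one spatial and one temporal feature vector per time step, passes the fused sequence through a single GRU layer, and pools the hidden states with softmax temporal attention. It is a minimal pure-Python sketch under stated assumptions, not the thesis's implementation: the function names (`fuse`, `gru_step`, `encode_sequence`, `temporal_attention_pool`), the parameter dictionary layout, the fixed fusion weight `alpha`, and the externally supplied attention scores are all hypothetical, and it uses one GRU layer where the thesis uses two. The GRU update follows the standard formulation h' = (1 - z) * h + z * n.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse(spatial, temporal, alpha):
    """Weighted fusion of a spatial and a temporal feature vector.

    Assumption: alpha is a fixed scalar in [0, 1]; an adaptive scheme
    would learn or predict it instead.
    """
    return [alpha * s + (1.0 - alpha) * t for s, t in zip(spatial, temporal)]


def gru_step(x, h, p):
    """One step of a standard GRU cell.

    p is a hypothetical dict holding input weights (Wz, Wr, Wh),
    recurrent weights (Uz, Ur, Uh) and biases (bz, br, bh).
    """
    z = [sigmoid(a + b + c)
         for a, b, c in zip(matvec(p["Wz"], x), matvec(p["Uz"], h), p["bz"])]
    r = [sigmoid(a + b + c)
         for a, b, c in zip(matvec(p["Wr"], x), matvec(p["Ur"], h), p["br"])]
    gated_h = [ri * hi for ri, hi in zip(r, h)]
    n = [math.tanh(a + b + c)
         for a, b, c in zip(matvec(p["Wh"], x), matvec(p["Uh"], gated_h), p["bh"])]
    return [(1.0 - zi) * hi + zi * ni for zi, hi, ni in zip(z, h, n)]


def encode_sequence(spatial_seq, temporal_seq, alpha, params, h0):
    """Fuse the two streams step by step and collect every GRU hidden state."""
    h = h0
    states = []
    for s, t in zip(spatial_seq, temporal_seq):
        h = gru_step(fuse(s, t, alpha), h, params)
        states.append(h)
    return states


def temporal_attention_pool(states, scores):
    """Softmax-weight the per-step states and return (weights, pooled vector).

    Assumption: one relevance score per time step is supplied by the caller;
    a trained model would compute these scores from the states themselves.
    """
    weights = softmax(scores)
    dim = len(states[0])
    pooled = [sum(w * h[i] for w, h in zip(weights, states)) for i in range(dim)]
    return weights, pooled
```

In a full model, the pooled vector would feed a classifier over the behavior categories, and the weights, fusion coefficient, and attention scores would all be learned rather than supplied by hand.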
Keywords/Search Tags: Behavior recognition, Two-Stream convolutional neural network, Network model structure, Attention mechanism, GRU network