Video Action Recognition With Recurrent Convolutional Neural Networks

Posted on: 2019-10-20
Degree: Master
Type: Thesis
Country: China
Candidate: Y J Xu
Full Text: PDF
GTID: 2428330593951028
Subject: Computer Science and Technology
Abstract/Summary:
Deep learning has achieved great success in speech recognition and image recognition. Recently, Recurrent Convolutional Neural Networks (RCN), which combine the merits of Convolutional Neural Networks (CNN) and Recurrent Neural Networks (RNN), have been proposed to encode the spatio-temporal information contained in video. However, the RCN suffers from overfitting because it has too many parameters and too little training data. In this paper, we first put forward a Shared GRU-RCN (SGRU-RCN) model that reduces the number of parameters by sharing the input-to-hidden parameters of the original GRU-RCN architecture, which makes SGRU-RCN less prone to overfitting. Then, we propose a SeqVLAD model that integrates the SGRU-RCN and the VLAD encoding method into a single framework. In particular, we use the SGRU-RCN to learn spatio-temporal assignments for the successive convolutional feature maps extracted from consecutive video frames. With the learned assignments, the VLAD encoding can aggregate local descriptors that capture both the detailed spatial information within individual frames and the fine motion information across successive frames. Furthermore, we conduct experiments on video action recognition that demonstrate the effectiveness of our method. Finally, we conduct experiments on video captioning to show that the proposed method extends well to other tasks.
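The aggregation described above can be illustrated with a minimal NumPy sketch. It is not the thesis implementation: it assumes the convolutions are 1x1 (so each spatial location is handled by a plain matrix product), that the assignments are a softmax over the GRU hidden state, and that the final descriptor is L2-normalised. The function name seqvlad_sketch and all parameter names (W, Uz, Ur, Uh, centers) are hypothetical. The single projection wx, reused by both gates and the candidate state, stands in for the shared input-to-hidden parameters.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z, axis):
    z = z - z.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def seqvlad_sketch(feats, centers, W, Uz, Ur, Uh):
    """Illustrative only; shapes and parameterisation are assumptions.

    feats   : (T, N, D) local descriptors, N spatial locations per frame
    centers : (K, D)    VLAD cluster centres
    W       : (D, K)    input-to-hidden map shared by gates and candidate
    Uz, Ur, Uh : (K, K) hidden-to-hidden maps
    Returns a (K * D,) L2-normalised video descriptor.
    """
    T, N, D = feats.shape
    K = centers.shape[0]
    h = np.zeros((N, K))      # one hidden state per spatial location
    vlad = np.zeros((K, D))
    for t in range(T):
        x = feats[t]
        wx = x @ W            # shared input projection, computed once per frame
        z = sigmoid(wx + h @ Uz)               # update gate
        r = sigmoid(wx + h @ Ur)               # reset gate
        h_tilde = np.tanh(wx + (r * h) @ Uh)   # candidate state
        h = (1.0 - z) * h + z * h_tilde
        a = softmax(h, axis=1)                 # (N, K) soft assignments
        # Accumulate sum_i a[i, k] * (x[i] - centers[k]) for every cluster k.
        vlad += a.T @ x - a.sum(axis=0)[:, None] * centers
    v = vlad.ravel()
    return v / (np.linalg.norm(v) + 1e-12)
```

Because the hidden state carries over from frame to frame, the assignment of a descriptor at time t depends on earlier frames, which is how this sketch lets motion information influence the aggregation.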
Keywords/Search Tags: Video Action Recognition, Recurrent Neural Networks, Convolutional Neural Networks, Recurrent Convolutional Neural Networks