Human Posture Prediction Based On Gated Recurrent Neural Network

Posted on: 2021-01-13
Degree: Master
Type: Thesis
Country: China
Candidate: Z Z Chen
Full Text: PDF
GTID: 2428330605456092
Subject: Instrument Science and Technology
Abstract/Summary:
Human posture prediction is a challenging task in computer vision. Given the observed motion of a target such as a pedestrian, it estimates the target's next behavior and predicts its subsequent trajectory. In recent years it has found increasingly wide application, for example in human-computer interaction, personnel tracking, and autonomous driving. With the advance of deep learning, prediction methods based on convolutional neural networks and recurrent neural networks have been widely developed. However, prediction quality is affected by uncertain factors such as movement speed, movement amplitude, and background. These factors cause the first predicted frame to be discontinuous with the last observed frame and shorten the time span over which prediction remains accurate.

To address the discontinuous first frame, the short accurate prediction horizon, complex network structures, and difficult training, this thesis proposes three network models: a model based on a bidirectional GRU network (EBiGRU-D), a model based on the attention mechanism (At-seq2seq), and a model that combines the bidirectional GRU network with the attention mechanism (BiAGRU-seq2seq).

In the EBiGRU-D model, the encoder is a bidirectional gated recurrent unit (GRU) network and the decoder is a GRU network. The bidirectional GRU reads the input sequence in both the forward and the backward direction and encodes it into a state vector, which is then passed to the decoder for decoding. The main advantage of the bidirectional GRU is that its output at the current time step depends on the states at both earlier and later time steps, so the output takes into account the data characteristics on both sides.

In the At-seq2seq model, both the encoder and the decoder are GRU networks, but the decoder additionally uses an attention mechanism. The attention mechanism encodes the encoder output as a sequence of vectors rather than a single vector, so that the decoder can select the most relevant part of this sequence at each decoding step.

The BiAGRU-seq2seq model combines the advantages of EBiGRU-D and At-seq2seq. Its encoder is a bidirectional GRU network, and its GRU decoder uses the attention mechanism. The model also introduces a residual architecture, which takes both the input and the output of the decoder in order to model the velocity of human movement.

The three models are evaluated on the Human3.6M video pose data set, which is currently the largest publicly available video pose data set in the world. It contains a variety of activities performed by professional actors and recorded with a Vicon motion capture system. Experimental results show that the proposed models not only reduce the error of human posture prediction but also accurately predict the human posture over multiple frames.
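The data flow described for BiAGRU-seq2seq can be illustrated with a minimal sketch. It is not the thesis implementation: the weights are random and untrained, biases are omitted, dot-product attention is assumed because the abstract does not name a scoring function, and the residual step is read here as "next pose = previous pose + decoder output", so that the decoder output plays the role of a velocity. All names (`GRUCell`, `encode_bidirectional`, `attend`, `predict`) and the hidden size are hypothetical.

```python
import math
import random


def rand_matrix(rows, cols, rng):
    """Small random weights; a trained model would learn these."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def matvec(W, x):
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class GRUCell:
    """Standard GRU update without bias terms, acting on the concatenation [x, h]."""

    def __init__(self, input_size, hidden_size, rng):
        n = input_size + hidden_size
        self.Wz = rand_matrix(hidden_size, n, rng)  # update gate
        self.Wr = rand_matrix(hidden_size, n, rng)  # reset gate
        self.Wh = rand_matrix(hidden_size, n, rng)  # candidate state

    def step(self, x, h):
        xh = x + h  # list concatenation
        z = [sigmoid(v) for v in matvec(self.Wz, xh)]
        r = [sigmoid(v) for v in matvec(self.Wr, xh)]
        gated = x + [ri * hi for ri, hi in zip(r, h)]
        cand = [math.tanh(v) for v in matvec(self.Wh, gated)]
        return [(1 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]


def encode_bidirectional(frames, fwd, bwd, hidden_size):
    """Run one GRU forward and one backward; concatenate the states per time step."""
    h, fwd_states = [0.0] * hidden_size, []
    for x in frames:
        h = fwd.step(x, h)
        fwd_states.append(h)
    h, bwd_states = [0.0] * hidden_size, []
    for x in reversed(frames):
        h = bwd.step(x, h)
        bwd_states.append(h)
    bwd_states.reverse()  # align backward states with forward time order
    return [f + b for f, b in zip(fwd_states, bwd_states)]


def attend(query, memory):
    """Dot-product attention (an assumption): softmax weights and weighted sum."""
    scores = [sum(q * m for q, m in zip(query, mem)) for mem in memory]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    context = [sum(w * mem[i] for w, mem in zip(weights, memory))
               for i in range(len(memory[0]))]
    return context, weights


def predict(frames, n_future, hidden_size=8, seed=0):
    """Encode observed poses, then decode n_future poses with attention and a residual step."""
    rng = random.Random(seed)
    dim = len(frames[0])
    fwd = GRUCell(dim, hidden_size, rng)
    bwd = GRUCell(dim, hidden_size, rng)
    dec = GRUCell(dim + 2 * hidden_size, 2 * hidden_size, rng)
    W_out = rand_matrix(dim, 2 * hidden_size, rng)

    memory = encode_bidirectional(frames, fwd, bwd, hidden_size)
    # Initial decoder state: final forward state plus final backward state.
    h = memory[-1][:hidden_size] + memory[0][hidden_size:]
    pose, outputs = frames[-1], []
    for _ in range(n_future):
        context, _ = attend(h, memory)
        h = dec.step(pose + context, h)
        velocity = matvec(W_out, h)
        pose = [p + v for p, v in zip(pose, velocity)]  # residual connection
        outputs.append(pose)
    return outputs
```

Calling `predict(frames, n_future=4)` on a list of equally sized pose vectors returns four predicted pose vectors of the same size. Because each prediction is the previous pose plus a decoder output, the first predicted frame starts from the last observed frame, which is one plausible reading of how a residual connection could reduce the first-frame discontinuity mentioned above.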
Keywords/Search Tags: Human posture prediction, Recurrent neural network, Attention mechanism, Deep learning