
Research On Early Human Action Recognition Based On Adaptive Graph Convolutional Network With Adversarial Learning

Posted on: 2022-08-15
Degree: Master
Type: Thesis
Country: China
Candidate: G X Li
GTID: 2518306314973239
Subject: Control Science and Engineering
Abstract/Summary:
Action recognition is an active research topic in computer vision. Earlier work relied mainly on video data, but in recent years advances in depth sensor technology have made human skeleton data easier to obtain, and skeleton-based action recognition has gradually become an important research direction. Skeleton data is a high-level representation of human actions. Compared with video data, it is compact, unaffected by lighting and complex backgrounds, and robust. Early action recognition extends action recognition: the class of an action must be recognized before the action ends, so that the machine can react in time and the response delay is shortened. Early action recognition has broad application prospects in security monitoring, intelligent driving, human-machine collaboration, and related fields. This paper reviews recent work on skeleton-based action recognition and on early action recognition. The review shows that research on skeleton-based early action recognition remains limited, so this paper focuses on that problem. The main contents and innovations are as follows:

(1) An adaptive graph convolutional network with adversarial learning (AGCN-AL) is proposed for skeleton-based early action recognition. The global information of an action is very important for recognizing it, but early action recognition cannot observe the complete execution of the action, which makes it harder than ordinary action recognition. In this paper, adversarial learning is used to bring the features of partial and full sequences of the same class as close together as possible in the feature space, so that the network learns the latent global information in partial sequences and recognizes them more accurately. In experiments on the NTU RGB+D dataset and the SYSU 3DHOI dataset, the proposed method achieves better results than the other methods compared.

(2) A temporal-dependent loss function is proposed to prevent the network from overfitting partial sequences with small observation ratios. The smaller the proportion of the action that has been observed (i.e. the smaller the observation ratio), the less information the partial sequence contains and the harder it is to recognize. To prevent overfitting and speed up convergence, the temporal-dependent loss function reduces the penalty for misclassifying sequences with small observation ratios. In experiments, it achieves better results than the standard cross-entropy loss function.

(3) Two kinds of fusion, multi-view fusion and skeleton-video fusion, are proposed to further improve the accuracy of early action recognition. To address the occlusion problem of a single view, this paper proposes several multi-view fusion methods, which significantly improve the accuracy of early action recognition. Because skeleton data cannot provide information about the objects involved in an action, this paper proposes fusing skeleton data with video data to supply the missing scene information. Experimental results show that fusing the two kinds of data greatly improves the accuracy of early action recognition.
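The idea behind the temporal-dependent loss in (2) can be illustrated with a minimal sketch. The abstract does not give the exact form of the loss, so everything specific here is an assumption made for illustration: the linear weight, the `floor` parameter, and the function names `temporal_weight` and `temporal_dependent_loss` are hypothetical and are not taken from the thesis. The sketch only shows the stated principle, that the cross-entropy penalty is scaled down for sequences with small observation ratios.

```python
import math


def softmax_cross_entropy(logits, label):
    """Standard cross-entropy for one sample, computed from raw class scores."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]


def temporal_weight(ratio, floor=0.1):
    """Hypothetical weight: grows linearly with the observation ratio.

    `floor` (an assumption) keeps very short sequences from being ignored.
    """
    if not 0.0 < ratio <= 1.0:
        raise ValueError("observation ratio must lie in (0, 1]")
    return max(floor, ratio)


def temporal_dependent_loss(logits, label, ratio):
    """Cross-entropy scaled down for sequences with small observation ratios."""
    return temporal_weight(ratio) * softmax_cross_entropy(logits, label)


def batch_loss(batch):
    """Mean loss over an iterable of (logits, label, ratio) triples."""
    losses = [temporal_dependent_loss(lg, y, r) for lg, y, r in batch]
    return sum(losses) / len(losses)
```

With this weighting, a misclassified sequence observed at a ratio of 0.2 contributes one fifth of the penalty it would contribute when fully observed; other monotonically increasing weights would serve the same purpose.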
Keywords/Search Tags: Skeleton data, Early action recognition, Adversarial learning, Temporal-dependent loss function, Data fusion