Research On RGB Video And 3D Skeletal Sequence Based Cross-view Human Action Recognition

Posted on: 2021-03-03
Degree: Master
Type: Thesis
Country: China
Candidate: S Pan
Full Text: PDF
GTID: 2518306047984929
Subject: Communication and Information System
Abstract/Summary:
Human action recognition is an important subject in computer vision, with broad application prospects in fields such as medical monitoring, smart homes, virtual reality, human-computer interaction, intelligent security, content-based video retrieval, and athlete training assistance. Traditional human action recognition methods usually do not consider view variation: they assume that the view is the same during training and testing. In practice the view usually changes, and when it changes substantially the recognition accuracy of these methods drops significantly. This thesis addresses the cross-view human action recognition problem, in which the model is trained and tested on different views, and aims to develop a recognition system that is robust to view variation. It covers cross-view recognition based on both RGB video and 3D skeleton sequences. The major contributions are outlined as follows:

1. For RGB video based cross-view human action recognition, a system based on transferable dictionary learning is designed. The system first extracts view-dependent features from the videos. It then uses transferable dictionary learning to learn a set of transferable dictionaries, one per view. The inputs are corresponding features, that is, features that belong to the same action but come from different views, and the learning objective forces corresponding features to have the same sparse representation. A label consistency constraint added to the dictionary learning process encourages samples of the same action to share a sparse representation and samples of different actions to have different ones, which makes the sparse representations more discriminative. Using the transferable dictionary of the corresponding view, features from different views are mapped into a common sparse representation space, yielding view-independent sparse representations.

2. For 3D skeleton sequence based cross-view human action recognition, a system based on a view adaptive CNN is designed. The system first applies preprocessing steps, such as center-point alignment, random skeleton rotation, and skeleton image generation, to convert 3D skeleton sequences into skeleton images suitable for CNN training. It then trains the view adaptive CNN designed in this thesis, which consists of a view transformation network and a main classification network. The view transformation network automatically determines, from the skeleton image, the best virtual observation view for the main network. The skeleton image is transformed to that view, and the resulting image is used to train and test the main classification network.
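As a rough illustration of the first system, a two-view transferable dictionary objective with a label consistency term is often written in a form like the one below. All symbols are assumptions made for illustration and are not taken from the thesis: the columns of Y_1 and Y_2 hold corresponding features from two views, D_1 and D_2 are the per-view dictionaries, X is the shared sparse code matrix with columns x_i, Q is a target code matrix built from the action labels, A is a linear map, \alpha weights the label consistency term, and s bounds the number of nonzero coefficients.

```latex
\min_{D_1, D_2, A, X} \;
  \lVert Y_1 - D_1 X \rVert_F^2
+ \lVert Y_2 - D_2 X \rVert_F^2
+ \alpha \lVert Q - A X \rVert_F^2
\quad \text{s.t.} \quad \lVert x_i \rVert_0 \le s \;\; \forall i
```

In this sketch, using the same X in both reconstruction terms is what ties corresponding features to one sparse representation, while the Q term pushes codes of the same action toward a common pattern.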
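The preprocessing steps named for the second system can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis implementation: the sequence is assumed to be a NumPy array of shape (frames, joints, 3), joint 0 is assumed to be the center point, the random rotation is assumed to be about the vertical (y) axis, and the skeleton image is assumed to map the x, y, z coordinates to three 8-bit channels. The function names are hypothetical, and the view transformation network itself is not shown.

```python
import numpy as np


def align_to_center(seq, center_joint=0):
    """Translate the sequence so the chosen joint of the first frame is the origin.

    Assumption: joint 0 plays the role of the center point.
    """
    return seq - seq[0, center_joint]


def rotation_about_y(angle):
    """Return the 3x3 matrix for a rotation by `angle` radians about the y axis."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, 0.0, s],
                     [0.0, 1.0, 0.0],
                     [-s, 0.0, c]])


def random_rotate(seq, max_angle, rng):
    """Rotate every joint by one random angle drawn from [-max_angle, max_angle].

    Assumption: rotating about the vertical axis imitates a change of camera view.
    """
    angle = rng.uniform(-max_angle, max_angle)
    return seq @ rotation_about_y(angle).T


def to_skeleton_image(seq):
    """Scale coordinates to 0..255 and return a (frames, joints, 3) uint8 image.

    Rows are frames, columns are joints, and channels are x, y, z.
    """
    lo, hi = seq.min(), seq.max()
    scale = hi - lo if hi > lo else 1.0  # avoid division by zero for constant input
    return np.round(255.0 * (seq - lo) / scale).astype(np.uint8)


# Hypothetical input: 40 frames of a 25-joint skeleton with random coordinates.
rng = np.random.default_rng(0)
seq = rng.normal(size=(40, 25, 3))

centered = align_to_center(seq)
rotated = random_rotate(centered, np.pi / 6, rng)
image = to_skeleton_image(rotated)
```

The resulting `image` array has the layout a 2D CNN expects, so a sequence of any length becomes a fixed-channel image whose height is the number of frames.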
Keywords/Search Tags: Human Action Recognition, Cross-view, Transferable Dictionary Learning, View Adaptive