Human Action Recognition Algorithm Based On Multi-modal

Posted on: 2021-04-30
Degree: Master
Type: Thesis
Country: China
Candidate: C Ding
Full Text: PDF
GTID: 2428330602470241
Subject: Information and Communication Engineering
Abstract/Summary:
With the development of the information age, a more convenient and intelligent lifestyle appeals to more and more people, and artificial intelligence is an important means of achieving it. As an essential branch of artificial intelligence, human action recognition plays an indispensable role in many research and application fields. In vision-based human action recognition, the data modalities that can be used to identify action classes are RGB, depth, and skeleton, and each modality provides different information. For example, RGB data provides appearance information, while depth data provides depth information. The information provided by one modality is likely to be complementary to that of the others. Based on this idea, this thesis designs and constructs a multi-modal information complementary network that makes full use of the complementarity of the RGB and depth modalities. In addition, video data carries long-range temporal information that is not available in other types of data, and using this information effectively can improve both accuracy and efficiency. Furthermore, to address the drop in accuracy of traditional action recognition algorithms on similar action classes, visualization experiments show that similar actions fall into two categories: those that share sub-motions and those that are influenced by other objects. These two kinds of similarity are handled by sub-action division and by an object-detection-assisted method, respectively. The final recognition accuracy is improved while efficiency is kept unchanged. The specific work is as follows.

Firstly, a motion-energy-guided multi-modal information complementary network is proposed. The network uses the abundant appearance information in RGB data and the depth information in depth data, together with robustness to changes in image brightness and observation angle, and accomplishes multi-modal fusion through the complementarity of the two modalities. In addition, an energy-guided video segmentation method is adopted to model the long-range temporal structure and to distinguish action classes that share sub-actions. At the feature fusion stage, multimodality-stitch fusion is proposed: by connecting the feature maps of multiple convolutional layers, the network not only shares the local features of the two modalities in the shallow layers but also fuses global features in the deep convolutional layers. The algorithm is verified on the NTU-RGB+D dataset and achieves the best recognition rate.

Secondly, an object-detection-assisted action recognition algorithm is proposed to address the reduced accuracy of action recognition on similar actions that are influenced by other objects. The algorithm is motivated by the observation that humans, when judging an action class, take into account other objects that help determine complex and similar actions. An object detection network is used to detect object categories in support of the action recognition algorithm, exploiting its strength in that task. A network fusion module is also designed to combine the strengths of the two networks and to avoid misjudging the action class when their results differ. This method safeguards the accuracy of recognizing similar actions that are affected by other objects and has a positive impact on the results.
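The abstract does not spell out how the energy-guided video segmentation works, so the following is only a minimal sketch of one plausible reading, not the thesis's actual method. It assumes that motion energy is the sum of absolute pixel differences between consecutive frames, and that segment boundaries are placed so each segment holds roughly the same share of the total energy. The function names `motion_energy` and `energy_guided_segments` and the frame representation (flat lists of grayscale values) are hypothetical.

```python
from bisect import bisect_left
from itertools import accumulate


def motion_energy(frames):
    """Energy of each frame-to-frame transition (assumed definition):
    the sum of absolute pixel differences between consecutive frames."""
    return [
        sum(abs(b - a) for a, b in zip(prev, cur))
        for prev, cur in zip(frames, frames[1:])
    ]


def energy_guided_segments(frames, num_segments):
    """Split frame indices into contiguous, non-empty segments that hold
    roughly equal motion energy. Returns (start, end) pairs, end exclusive."""
    n = len(frames)
    if num_segments < 1 or n < num_segments:
        raise ValueError("need at least one frame per segment")

    cumulative = list(accumulate(motion_energy(frames)))
    total = cumulative[-1] if cumulative else 0

    if total == 0:
        # Static clip: no motion to guide the split, so fall back to uniform.
        bounds = [k * n // num_segments for k in range(num_segments + 1)]
    else:
        bounds = [0]
        for k in range(1, num_segments):
            target = total * k / num_segments
            # Index of the first transition at which the cumulative energy
            # reaches the target; the next segment starts at the frame that
            # this transition leads into.
            cut = bisect_left(cumulative, target) + 1
            # Clamp so that this segment and all remaining ones are non-empty.
            cut = max(cut, bounds[-1] + 1)
            cut = min(cut, n - (num_segments - k))
            bounds.append(cut)
        bounds.append(n)

    return list(zip(bounds, bounds[1:]))


# Toy clip of eight one-pixel frames: all motion happens in the first half.
clip = [[0], [10], [20], [30], [40], [40], [40], [40]]
segments = energy_guided_segments(clip, 4)
```

In this toy clip the high-motion first half is cut into short segments and the static tail is kept as one long segment, which is the intuition behind letting motion energy, rather than uniform time steps, decide where sub-actions are separated.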
Keywords/Search Tags:human action recognition, multi-modal fusion, motion energy guided video segmentation, multimodality-stitch fusion, object detection network, network fusion module