
Research On Action Recognition Based On Action Feature Optimization And Deep Learning

Posted on: 2022-03-30  Degree: Doctor  Type: Dissertation
Country: China  Candidate: X Xiong  Full Text: PDF
GTID: 1488306539488294  Subject: Industrial control engineering
Abstract/Summary:
Action recognition is an active research topic in control science and computer science, and in recent years it has attracted growing attention from academia and industry. It is in strong demand across many application scenarios, including intelligent industry, smart elderly care, entertainment, VR/AR, and the security infrastructure of smart cities. Existing methods, however, extract action features inaccurately, incompletely, and with coarse representations, which limits the effective application of action recognition.

With respect to action feature optimization, existing methods have problems at three levels: low-level single features, middle-level multi-modal features, and high-level structured information features. First, existing methods for recognizing abnormal human actions analyze spatial and temporal action features only on two-dimensional images or with two-dimensional convolution. They miss the low-level temporal dimension of action features, which increases intra-class variation and reduces inter-class separation. Second, existing methods do not extract middle-level modal fusion features and neglect the fusion of skeleton and RGB information. Relying on a single modality makes feature extraction inaccurate and feature representation coarse, which leads to poor performance on multi-class classification tasks. Third, existing methods do not extract high-level geometric features of human action. Pixel-level representations of action features suffer from scale uncertainty and background interference, and the coordinate information of skeleton joints is not fully used for feature extraction, so the resulting features have weak expressive power, inadequate representation, and insufficient robustness.

High-quality action features are the key to accurate action recognition. This dissertation therefore focuses on the optimization and extraction of action features and carries out research in three areas: optimization of low-level temporal action features, middle-level modal fusion of action features, and extraction of high-level geometric action features.

To address the lack of low-level temporal information in action features, a three-dimensional consecutive-low-pooling method for abnormal action recognition based on human skeleton information is proposed. First, a pose estimation algorithm is used to remove interfering information such as scene and appearance. Then, a clustering selector based on human skeleton action features is proposed to obtain an optimized video sequence with more temporal attention and more action feature information. Finally, a consecutive-low-pooling neural network based on three-dimensional convolution is proposed to extract the spatial and temporal information of the action and produce the classification and recognition results.

To address the lack of middle-level modal fusion features, an action recognition method based on action sequence optimization and a two-stream network is proposed. Existing methods pay little attention to the activity region of an action. The proposed action sequence optimization method optimizes the features of the activity region in multi-shot video, increases the proportion of the active region, removes redundant sequences, and improves feature extraction. The proposed three-dimensional convolutional two-stream fusion network extracts features from the skeleton and RGB modalities separately, and fuses the RGB and skeleton data for classification while extracting spatial and temporal features. The final action classification results are obtained by fusing the scores of the two streams.

To address the lack of high-level geometric action features, an end-to-end action recognition method based on human skeleton feature optimization and an adaptive graph convolutional neural network is proposed. The proposed skeleton feature optimization (SFO) method lets the graph neural network aggregate feature information from other joints more effectively, and improves the expressive ability and attention of the graph convolutional neural network. To address the tendency of graph convolutional networks (GCNs) to over-smooth and the weak ability of existing methods to extract human action features, the proposed adaptive graph convolutional neural network introduces three innovations: the adaptive pooling operation (APO), the graph structure mask (GSM), and directed graph mapping (DGM). The APO alleviates the over-smoothing problem of graph convolutional networks by introducing learnable high-frequency feature components. The GSM and DGM strengthen the ability of graph convolution to extract the structural information and the directed graph information of the human body, respectively, and improve the extraction of action features.

The proposed methods have been evaluated experimentally, and some of the results have been published in SCI journals. Experiments show that the proposed methods improve the extraction of action features and achieve better recognition performance than existing methods on some public datasets, which supports their effectiveness. This study promotes the application of action recognition in smart industry and smart cities.
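The abstract states that the two-stream method obtains its final classification by fusing the scores of the RGB and skeleton streams, but it does not give the fusion rule. As an illustration only, the following minimal sketch assumes a weighted average of per-stream softmax scores, which is one common form of late fusion. The function names `softmax` and `fuse_scores`, the `rgb_weight` parameter, and the example logits are hypothetical and are not taken from the dissertation.

```python
import math


def softmax(logits):
    """Convert raw class logits into probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_scores(rgb_logits, skeleton_logits, rgb_weight=0.5):
    """Weighted-average late fusion of two streams (assumed rule, not the author's).

    Returns the fused class probabilities and the index of the predicted class.
    """
    if len(rgb_logits) != len(skeleton_logits):
        raise ValueError("both streams must score the same number of classes")
    if not 0.0 <= rgb_weight <= 1.0:
        raise ValueError("rgb_weight must lie in [0, 1]")

    rgb_probs = softmax(rgb_logits)
    skeleton_probs = softmax(skeleton_logits)

    # Convex combination of the two probability vectors, so the result still sums to 1.
    fused = [
        rgb_weight * r + (1.0 - rgb_weight) * s
        for r, s in zip(rgb_probs, skeleton_probs)
    ]
    predicted = max(range(len(fused)), key=fused.__getitem__)
    return fused, predicted


if __name__ == "__main__":
    # Hypothetical logits for three action classes from each stream.
    fused, predicted = fuse_scores([2.0, 1.0, 0.1], [0.1, 0.2, 3.0])
    print(fused, predicted)
```

Under this assumed rule, setting `rgb_weight` to 1.0 or 0.0 reduces the prediction to a single stream, which gives a simple way to compare single-modality and fused results.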
Keywords/Search Tags: action recognition, feature optimization, deep learning, three-dimensional convolution, two-stream network, graph convolution