Online Human Action Analysis Based On Deep Learning

Posted on: 2020-10-07
Degree: Master
Type: Thesis
Country: China
Candidate: Y C Wu
GTID: 2428330578967282
Subject: Computer Science and Technology
Abstract/Summary:
Online video action analysis is an important research topic in computer vision, with a wide range of practical applications in visual monitoring, human-computer interaction, and intelligent robot navigation. Unlike offline video action analysis, which recognizes actions only after they have been completely captured, online action analysis aims to detect the state of an ongoing action as early as possible, so that upcoming or ongoing events can be answered accurately and quickly. This thesis studies online human action analysis, covering video action recognition, temporal action detection, and repetitive action counting.

For video action recognition, the BN-Inception (Inception with Batch Normalization) network models the appearance of actions to extract spatial features, and the C3D (3-Dimensional Convolutional) network models spatio-temporal features, including the contextual information of actions. Together these yield a robust action feature representation. For multi-class recognition, a one-vs-rest LSVM and a classification network trained with softmax loss are each used to train a classification model and perform online video action recognition. Experiments verify the validity of the approach.

For temporal action detection, the goal is to accurately detect the category and the start and end times of actions in long untrimmed videos. A method is proposed that optimizes the extracted candidate proposals according to a rule of action time semantic continuity. First, sliding windows of the same scale and of different scales are integrated according to this rule; then the classification confidence score is recomputed for the integrated result, and inaccurate detections are further eliminated by non-maximum suppression. This method overcomes the inherent limitation of sliding windows: it can produce action time segments of any length and suppress redundant detections, so that the detection results better match human expectations.

For repetitive action counting, unlike traditional methods, which can only handle static and stationary motion, the spatial and temporal features extracted by deep ConvNets are used to obtain the motion pattern of the repeated action. First, the PCA algorithm reduces the dimensionality of the high-dimensional features to obtain principal components with time-series motion characteristics; then a smooth motion trajectory is extracted by a Fourier transform with adaptive threshold filtering, which completes the unconstrained repetitive action counting task. The experimental results show that this method is also effective for dynamic and non-stationary repeated actions and is reasonably robust on video data from complex real scenes.

Practical applications require time efficiency as well as recognition accuracy, so the work is also tested and analyzed online. The experimental results show that when the current frame and the 479 frames preceding it are acquired online as input, processing is 2.93 times faster than the actual duration of the input. This demonstrates the effectiveness of the online video analysis and lays a theoretical foundation for the practical application of the task.
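The proposal refinement described for temporal action detection (integrating sliding windows, then applying non-maximum suppression) can be illustrated with a minimal sketch. The abstract does not spell out the continuity rule, so the merging criterion here (same predicted label, gap no larger than `max_gap`) is an assumption, as are the function names and the IoU threshold. The rescoring step between merging and suppression is omitted because it needs the trained classifier.

```python
def merge_continuous(windows, max_gap=0.0):
    """Merge same-label windows (start, end, label) that overlap or lie
    within max_gap of each other. Assumed stand-in for the thesis's
    'action time semantic continuity' rule, which the abstract does not define."""
    merged = []
    # Sorting by label, then start time, puts mergeable windows next to each other.
    for start, end, label in sorted(windows, key=lambda w: (w[2], w[0])):
        if merged and merged[-1][2] == label and start - merged[-1][1] <= max_gap:
            prev = merged[-1]
            merged[-1] = (prev[0], max(prev[1], end), label)
        else:
            merged.append((start, end, label))
    return merged


def temporal_iou(a, b):
    """Intersection over union of two 1-D segments given as (start, end, ...)."""
    inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union > 0 else 0.0


def temporal_nms(segments, iou_threshold=0.5):
    """Greedy non-maximum suppression over (start, end, score) segments:
    keep the highest-scoring segment, drop any later one overlapping a kept
    segment by more than iou_threshold."""
    kept = []
    for seg in sorted(segments, key=lambda s: s[2], reverse=True):
        if all(temporal_iou(seg, k) <= iou_threshold for k in kept):
            kept.append(seg)
    return kept
```

Because merged windows can span several original windows, the resulting segments are no longer tied to the fixed sliding-window lengths, which matches the abstract's claim that segments of any length can be produced.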
Keywords/Search Tags: human action recognition, temporal action detection, action time semantic continuity, repetitive action counting, deep learning