Research on Human Action Recognition Based on Double-Layer Conditional Random Fields

Posted on: 2019-06-17
Degree: Master
Type: Thesis
Country: China
Candidate: X D Dong
GTID: 2428330566495893
Subject: Signal and Information Processing
Abstract/Summary:
Human action recognition is a fundamental research topic in computer vision, with broad application prospects in smart homes, intelligent monitoring, and video retrieval. The goal of simple action recognition is to extract and analyze information about human actions from video; for complex activities, it is also necessary to model the relationships between the actions. This thesis focuses on modeling and recognizing a series of simple actions that belong to one high-level activity.

With the development of deep learning, convolutional neural networks (CNNs) have become effective for image recognition. For recognizing human actions in video segments, however, a 2D CNN cannot effectively capture the spatial-temporal information of an action as it unfolds. In this thesis, a 3D convolutional neural network and a convolutional LSTM are used to extract spatial-temporal features from video segments of simple actions. The proposed method first learns short-term spatial-temporal features of a video segment with the 3D CNN, and then learns long-term spatial-temporal features with the convolutional LSTM, taking the extracted short-term features as input.

Conditional random field (CRF) models capture interactions among the target states over a few time steps, which gives them good prediction performance in temporal sequence labeling. For human action recognition, however, existing CRF formulations typically have limited ability to capture higher-order dependencies among the given states and deeper intermediate representations within the target states, both of which are potentially useful for modeling complex action recognition scenarios. This thesis presents a novel double-layer conditional random fields (DL-CRFs) model for human action recognition, which captures higher-order dependencies among the given states and richer contextual information within the target states. The model can also be treated as a linear-chain structure in the form of a clique tree, so that exact inference can be applied to reduce the computational complexity.

The parameters of the discriminative DL-CRFs model are learned with an improved Block-Coordinate Primal-Dual Frank-Wolfe algorithm with duality gap, within a structured support vector machine (SVM) framework. Experimental results on the public CAD-120 and CAD-60 data sets show that DL-CRFs outperforms the other strong models it is compared with, in both recognition performance and computational efficiency.
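To illustrate what exact inference on a linear-chain structure involves, the following is a minimal sketch of the standard Viterbi algorithm for a plain linear-chain model. It is not the DL-CRFs model or its clique-tree construction described in the abstract. The function name `viterbi`, the score tables `unary` and `pairwise`, and the example values are all hypothetical and chosen only for illustration.

```python
def viterbi(unary, pairwise):
    """Exact MAP inference on a linear chain (illustrative sketch only).

    unary[t][s]    : score of label s at time step t (hypothetical values)
    pairwise[a][b] : score of moving from label a to label b
    Returns the highest-scoring label sequence and its total score.
    """
    n_steps = len(unary)
    n_labels = len(unary[0])

    # score[s] = best total score of any label sequence ending in s so far
    score = list(unary[0])
    backptr = []  # backptr[t-1][b] = best previous label when step t has label b

    for t in range(1, n_steps):
        new_score = []
        ptr = []
        for b in range(n_labels):
            best_a = max(range(n_labels),
                         key=lambda a: score[a] + pairwise[a][b])
            new_score.append(score[best_a] + pairwise[best_a][b] + unary[t][b])
            ptr.append(best_a)
        score = new_score
        backptr.append(ptr)

    # Trace back from the best final label to recover the full sequence
    last = max(range(n_labels), key=lambda s: score[s])
    best_score = score[last]
    path = [last]
    for ptr in reversed(backptr):
        path.append(ptr[path[-1]])
    path.reverse()
    return path, best_score


# Two hypothetical labels over three time steps
unary = [[3.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
pairwise = [[0.5, -0.5], [-0.5, 0.5]]
path, best_score = viterbi(unary, pairwise)
print(path, best_score)  # [0, 1, 1] 5.0
```

The dynamic program visits each time step once and compares every pair of labels, so its cost grows linearly with sequence length and quadratically with the number of labels, rather than exponentially as brute-force enumeration would.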
Keywords/Search Tags: 3D CNN, CRFs, Structured SVM, Exact Inference, Block-Coordinate Primal Dual Frank-Wolfe Algorithm