
Research On Action Recognition Algorithm Based On Mid-level Network Structure

Posted on: 2018-04-07
Degree: Master
Type: Thesis
Country: China
Candidate: W J Tian
Full Text: PDF
GTID: 2348330521951182
Subject: Engineering
Abstract/Summary:
Action recognition and video classification are vital problems in computer vision. With the rapid growth of social media sharing, the volume of multimedia data keeps increasing, and large-scale video classification and annotation, especially for videos of human actions, urgently need to be addressed. Action recognition is widely used in video surveillance, video retrieval, and human-computer interaction. As an active topic in pattern recognition and machine learning, research on human action recognition has shifted its focus from optimizing the design of low-level features to extracting mid-level semantic features. Based on a comparative analysis of existing research results and their problems, this thesis aims to mine discriminative action parts and to model the interactions between them. The specifics are as follows.

1. The thesis analyzes the problems of extracting action parts with traditional clustering algorithms: the number of clusters must be set manually, the algorithms easily fall into local extrema, and the Euclidean distance does not work well in high-dimensional feature spaces. An improved spectral clustering algorithm is therefore proposed to extract action parts in which the trajectories of a given part are spatially close and move with similar velocity. Specifically, three types of distance (spatial, appearance, and velocity) are adopted, and both the local and the global information of the data distribution are integrated into the inter-cluster similarity. The resulting similarity measure is intended to be accurate enough that the extracted action parts conform more closely to human understanding of body motion and the clustering results improve.

2. To address the inadequate purity and discriminativeness of the candidate parts, a discriminant constraint method is employed. It removes trajectories that do not belong to a part, measures the intra-cluster and inter-cluster trigger frequencies, and then removes candidate parts without strong discriminative ability, which together ensure that the retained parts are sufficiently discriminative. Spectral clustering and the discriminant constraint method are combined to form the proposed discriminant clustering algorithm, which separates the motion of each part from the whole-body movement, avoids the problems of classical clustering methods, and yields parts with strong discriminative ability.

3. Because part interactions differ across actions, the interactions between action parts, namely the spatial-temporal relationship and the causal relationship, are further modeled. The part representations and interactions are then combined to construct the mid-level semantic Action-net, which expresses the correlations between action videos and categories and contains a small volume of data that is nonetheless rich in semantic category and motion information. Finally, the Action-net is fed into well-trained latent Support Vector Machine (LSVM) classifiers to obtain the final recognition result. Experiments on four benchmark datasets demonstrate the effectiveness of the proposed Action-net.

Finally, the main contents are summarized and directions for future research are presented.
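
As a rough illustration of the clustering idea in item 1, the sketch below groups trajectories with a spectral method whose affinity combines spatial, appearance, and velocity distances. It is not the thesis's algorithm: the abstract does not give the actual similarity measure, so the Gaussian kernel, the per-cue Euclidean distances, the `sigmas` scales, the function names, and the toy data are all assumptions. For brevity it makes a fixed two-way split on the sign of the second eigenvector, and it does not reproduce the local/global integration or the automatic choice of cluster number that the thesis describes.

```python
import numpy as np


def pairwise_sq_dist(x):
    """Squared Euclidean distances between all rows of x."""
    diff = x[:, None, :] - x[None, :, :]
    return np.sum(diff ** 2, axis=-1)


def combined_affinity(position, appearance, velocity, sigmas=(1.0, 1.0, 1.0)):
    """Gaussian affinity over the sum of three scaled squared distances.

    The kernel form and the sigma scales are assumptions made for this sketch.
    """
    n = position.shape[0]
    total = np.zeros((n, n))
    for feats, sigma in zip((position, appearance, velocity), sigmas):
        total += pairwise_sq_dist(feats) / (2.0 * sigma ** 2)
    affinity = np.exp(-total)
    np.fill_diagonal(affinity, 0.0)  # no self-loops in the similarity graph
    return affinity


def spectral_bipartition(affinity):
    """Split items into two groups using the second-smallest eigenvector
    of the symmetric normalized graph Laplacian."""
    degree = affinity.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(degree)
    laplacian = np.eye(len(degree)) - d_inv_sqrt[:, None] * affinity * d_inv_sqrt[None, :]
    _, eigvecs = np.linalg.eigh(laplacian)  # eigenvalues in ascending order
    embedding = eigvecs[:, 1] * d_inv_sqrt  # rescale to the random-walk form
    return (embedding > 0).astype(int)


# Hypothetical toy data: six trajectories, each described by a mean position,
# an appearance descriptor, and a mean velocity.
position = np.array([[0.0, 0.0], [0.2, 0.1], [0.1, 0.3],
                     [2.0, 2.0], [2.1, 1.8], [1.9, 2.2]])
appearance = np.array([[1.0, 0.0], [0.9, 0.1], [1.0, 0.1],
                       [0.0, 1.0], [0.1, 0.9], [0.1, 1.0]])
velocity = np.array([[1.0, 0.0], [0.9, 0.1], [1.1, 0.0],
                     [-1.0, 0.0], [-0.9, 0.1], [-1.1, 0.0]])

labels = spectral_bipartition(combined_affinity(position, appearance, velocity))
print(labels)
```

In this toy setup the first three trajectories are close in all three cues and the last three form a second group, so the split should place each triple in its own cluster. Which triple receives label 0 or 1 depends on the sign of the eigenvector and is arbitrary.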
Keywords/Search Tags: Action recognition, Spectral clustering, Discriminant constraint, Interactions, Latent Support Vector Machines