
Human Action Recognition Based On Sparsely Coded Spatio-temporal Video Features

Posted on: 2015-02-22
Degree: Master
Type: Thesis
Country: China
Candidate: Y Ding
Full Text: PDF
GTID: 2298330452963988
Subject: Computer Science and Technology
Abstract/Summary:
Human action recognition in videos mainly refers to classifying and labeling video files or clips that contain human actions. It has been a popular research area in recent years and has many applications in fields such as human-computer interaction, video surveillance, and video retrieval.

In related work on human action recognition in videos, methods based on local spatio-temporal video features have been studied intensively. In the field of image feature encoding, sparse coding is also a widely applied method. This thesis focuses on applying video representations that combine spatio-temporal features with sparse coding to human action recognition. The main research work and contributions of this thesis are as follows:

1. First, spatio-temporal feature detectors and descriptors are reviewed and compared, and their application scenarios and performance are analyzed. Then the principle and main algorithms of sparse coding are introduced, and the differences, advantages, and limitations of the algorithms are discussed in detail.
2. Next, some current video representations based on spatio-temporal features inappropriately encode the feature distribution of a video as a whole, even when the target action is repeated in it. To address this problem, a video sub-shot detection method based on clustering spatio-temporal video features is proposed. Combined with clustering evaluation methods that help choose the appropriate number of sub-shots, the method enables each sub-shot to contain one occurrence of the target action.
3. Moreover, a method based on the combination of spatio-temporal pyramid matching and max-pooling is proposed to generate the representations of the sub-shots. The sub-shot representations are further max-pooled to form the video representation.
4. Finally, the effectiveness of the above video representation framework is verified experimentally, and the experimental setup, the parameter selection process, and the results are presented.
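
The two-level pooling described in item 3 can be illustrated with a minimal sketch. It is not the thesis implementation: the function names, the pyramid levels (1 and 2 cells per axis), pooling over absolute values, and positions normalized to [0, 1) within each sub-shot are all assumptions made for illustration. The sparse codes are taken as given.

```python
from typing import Dict, List, Sequence, Tuple

Code = Sequence[float]                  # sparse code of one local feature
Position = Tuple[float, float, float]   # (x, y, t), assumed normalized to [0, 1)


def max_pool(codes: List[Code], dim: int) -> List[float]:
    """Element-wise maximum of absolute values; all zeros if `codes` is empty."""
    pooled = [0.0] * dim
    for code in codes:
        for i, value in enumerate(code):
            pooled[i] = max(pooled[i], abs(value))
    return pooled


def pyramid_pool(codes: List[Code], positions: List[Position], dim: int,
                 levels: Sequence[int] = (1, 2)) -> List[float]:
    """Max-pool the codes inside every cell of every pyramid level and
    concatenate the results. Level n splits x, y and t into n bins each."""
    representation: List[float] = []
    for n in levels:
        cells: Dict[Tuple[int, int, int], List[Code]] = {}
        for code, (x, y, t) in zip(codes, positions):
            key = (min(int(x * n), n - 1),
                   min(int(y * n), n - 1),
                   min(int(t * n), n - 1))
            cells.setdefault(key, []).append(code)
        for ix in range(n):
            for iy in range(n):
                for it in range(n):
                    representation.extend(max_pool(cells.get((ix, iy, it), []), dim))
    return representation


def video_representation(sub_shots: List[Tuple[List[Code], List[Position]]],
                         dim: int, levels: Sequence[int] = (1, 2)) -> List[float]:
    """Pool each sub-shot with the pyramid, then max-pool across sub-shots."""
    reps = [pyramid_pool(codes, positions, dim, levels)
            for codes, positions in sub_shots]
    return max_pool(reps, len(reps[0]))


# Hypothetical toy input: two sub-shots with 2-dimensional sparse codes.
shot_a = ([[0.5, 0.0], [0.0, -0.2]], [(0.1, 0.1, 0.1), (0.9, 0.9, 0.9)])
shot_b = ([[0.1, 0.9]], [(0.1, 0.1, 0.1)])
video = video_representation([shot_a, shot_b], dim=2)
```

With these assumed levels the representation has dim * (1 + 8) entries. Because the final step takes a maximum across sub-shots, repeating the same action in more sub-shots does not change the magnitude of the pooled response.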
Keywords/Search Tags: spatio-temporal features, sparse coding, human action recognition, video representation