Research On Human Action Recognition Of Video Content Based On Spatial-Temporal Features

Posted on: 2014-01-21
Degree: Master
Type: Thesis
Country: China
Candidate: Y Q Li
GTID: 2248330392961041
Subject: Electronics and Communications Engineering
Abstract/Summary:
With the rapid growth of automation and information technology, multimedia has become an indispensable part of daily life and is used in areas such as education, health care, sanitation, production, and transportation. Video, as a form of multimedia, is becoming more and more common. Much research addresses automatic video content detection, including content-based video classification, content-based video summarization, and content-based video retrieval, in order to reduce the time spent on video content recognition and retrieval and to make video processing more efficient. Human motion recognition in video sequences is an attractive and challenging research topic in the computer field, and it can be used in many research fields, including action acquisition, man-machine fusion, environmental control, video summarization, and security monitoring.

This paper studies human action recognition in videos and makes contributions to feature extraction, video representation, and human action recognition models. Firstly, for feature extraction, a new feature called Oriented Gradient Histogram of Slide Blocks is proposed. It is based on the idea that a video can be regarded as a 3D volume in which a human action appears as a spatial-temporal silhouette surface. Actions of the same type generate similar 3D shapes, while the shapes of different action types differ greatly. The action type is recognized by building dense overlapping spatial-temporal slide blocks that capture the shape of the 3D silhouette surface of the human action.
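As a rough illustration of the dense overlapping slide blocks described above, the following Python sketch slides a cubic block over a binary silhouette volume and builds a gradient orientation histogram for each block. It is not the thesis's implementation: the function names, the block size, the stride, the bin count, and the choice to bin orientations in the x-y and x-t planes are all assumptions made for this example.

```python
import math


def central_diff(volume, t, y, x, axis):
    """Central difference along one axis (0=t, 1=y, 2=x), clamped at borders."""
    limits = (len(volume), len(volume[0]), len(volume[0][0]))
    lo = [t, y, x]
    hi = [t, y, x]
    lo[axis] = max(lo[axis] - 1, 0)
    hi[axis] = min(hi[axis] + 1, limits[axis] - 1)
    return volume[hi[0]][hi[1]][hi[2]] - volume[lo[0]][lo[1]][lo[2]]


def add_vote(hist, offset, bins, u, v):
    """Add a magnitude-weighted vote for the orientation of the vector (u, v)."""
    mag = math.hypot(u, v)
    if mag > 0.0:
        angle = math.atan2(v, u) % (2 * math.pi)
        idx = min(int(angle / (2 * math.pi) * bins), bins - 1)
        hist[offset + idx] += mag


def block_histogram(volume, t0, y0, x0, size, bins=8):
    """Orientation histogram of one block (assumed layout, not from the thesis).

    The first `bins` entries describe gradient orientation in the x-y plane
    (shape); the last `bins` entries describe orientation in the x-t plane
    (motion direction). The result is L1-normalised.
    """
    hist = [0.0] * (2 * bins)
    for t in range(t0, t0 + size):
        for y in range(y0, y0 + size):
            for x in range(x0, x0 + size):
                gt = central_diff(volume, t, y, x, 0)
                gy = central_diff(volume, t, y, x, 1)
                gx = central_diff(volume, t, y, x, 2)
                add_vote(hist, 0, bins, gx, gy)
                add_vote(hist, bins, bins, gx, gt)
    total = sum(hist)
    return [h / total for h in hist] if total > 0 else hist


def sliding_block_descriptors(volume, size=4, stride=2, bins=8):
    """Dense overlapping blocks: a stride smaller than the size gives overlap."""
    T, H, W = len(volume), len(volume[0]), len(volume[0][0])
    descriptors = []
    for t0 in range(0, T - size + 1, stride):
        for y0 in range(0, H - size + 1, stride):
            for x0 in range(0, W - size + 1, stride):
                descriptors.append(
                    block_histogram(volume, t0, y0, x0, size, bins))
    return descriptors
```

Here `volume` is a nested list indexed as `volume[t][y][x]`, with 1 inside the silhouette and 0 elsewhere. For a silhouette that does not move, the temporal gradient is zero, so the x-t half of each histogram only receives votes in the two horizontal bins.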
Secondly, for video representation, sparse coding is introduced to generate new sparse feature vectors, which are combined by a max pooling function into the final video descriptor. Thirdly, a BOW (Bag of Words) model that adopts sparse coding and a random forest is used to recognize human actions.

Three human action datasets are used to test the proposed human action recognition algorithm: the KTH human action dataset, the Weizmann dataset, and the UCF sports action dataset. The experimental results show that the spatial-temporal silhouette surface can effectively distinguish the video categories, and that detecting the 3D human shape by building dense overlapping slide blocks is accurate and effective. The proposed feature descriptor contains both human body shape information and motion direction information, so it can accurately identify the category of body movement. The proposed human action recognition model achieves a good accuracy rate and performs better than some other similar algorithms.
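The sparse coding and max pooling step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the abstract does not name the sparse coding solver, so greedy matching pursuit is used here as a stand-in. The dictionary is assumed to be given (learning it is not shown), all function names are hypothetical, and the random forest classification stage is omitted.

```python
import math


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def normalize(v):
    """Scale a vector to unit L2 norm (zero vectors are returned unchanged)."""
    n = math.sqrt(dot(v, v))
    return [x / n for x in v] if n > 0 else list(v)


def matching_pursuit(x, atoms, n_nonzero=2):
    """Greedy sparse code of x over unit-norm atoms (stand-in solver).

    Each iteration picks the atom most correlated with the residual,
    records its coefficient, and subtracts its contribution.
    """
    residual = list(x)
    code = [0.0] * len(atoms)
    for _ in range(n_nonzero):
        scores = [dot(residual, atom) for atom in atoms]
        k = max(range(len(scores)), key=lambda i: abs(scores[i]))
        if scores[k] == 0.0:
            break  # residual is orthogonal to every atom
        code[k] += scores[k]
        residual = [r - scores[k] * a for r, a in zip(residual, atoms[k])]
    return code


def max_pool(codes):
    """Per-atom maximum of absolute coefficients over all local codes."""
    return [max(abs(c[j]) for c in codes) for j in range(len(codes[0]))]


def video_descriptor(local_features, dictionary, n_nonzero=2):
    """Encode every local feature sparsely, then max-pool into one vector."""
    atoms = [normalize(a) for a in dictionary]
    codes = [matching_pursuit(f, atoms, n_nonzero) for f in local_features]
    return max_pool(codes)
```

The pooled vector has one entry per dictionary atom regardless of how many local features a video yields, which is what lets videos of different lengths be compared by a single classifier.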
Keywords/Search Tags: Spatial-temporal silhouette, gradient histogram, slide blocks, sparse coding