
Analyzing Human Activities in Videos using Component Based Models

Posted on: 2014-04-02
Degree: Ph.D
Type: Thesis
University: University of Southern California
Candidate: Khan, Furqan Muhammad
Full Text: PDF
GTID: 2458390005492736
Subject: Computer Science
Abstract/Summary:
With cameras getting smaller, better, and cheaper, the amount of video produced these days has increased exponentially. Although not comprehensive by any means, the fact that about 35 hours of video are uploaded to YouTube every minute is indicative of the amount of data being generated. This is in addition to the video recorded for surveillance by grocery stores and by security agencies at airports, train stations, and on streets. Whereas analysis of the video data is the core reason for collecting surveillance footage, services such as YouTube can also use video analysis to improve search and indexing. However, because of the extremely large amount of data generated, manual video analysis is not feasible; therefore, the development of methods that can automatically perform the intelligent task of visual understanding, specifically human activity recognition, has attracted much interest over the past couple of decades. Such capability is also desired to improve human-computer interaction. However, the associated problem of activity description, i.e., identifying the actor, the location, and the object of interaction, has not received much attention despite its importance for surveillance and indexing tasks. In this thesis, I propose methods for automated action analysis, i.e., recognition and description of human activities in videos.

The task of activity recognition is seemingly easy for humans, but it is very difficult for machines. The key challenge lies in modeling human actions and representing the transformation of visual data over time. This thesis contributes action models that are general enough to capture large variations within an action class while allowing robust discrimination between different action classes, together with corresponding inference mechanisms, and uses them to facilitate action description.
I model actions as compositions of several primitive events and use graphical models to evaluate the consistency of an action model with the video input. In the first part of the thesis, I use low-level features to capture the transformation of spatiotemporal data during each primitive event. In the second part, to facilitate description of activities, such as identification of the actor and the object of interaction, I decompose actions using high-level constructs: actors and objects. Primitive components represent properties of actors and their relationships with objects of interaction. Finally, I represent actions as transformations of the actor's limbs (human pose) over time and decompose actions using key poses. I infer the human pose, the object of interaction, and the action for each actor jointly using a dynamic Bayesian network.

This thesis furthers research on the relatively neglected but more comprehensive problem of action analysis, i.e., action recognition together with the associated problem of description. To support the thesis, I evaluated the presented algorithms on publicly available datasets. The performance metrics highlight the effectiveness of my algorithms on datasets that exhibit large variations in execution, viewpoint, actors, illumination, etc.
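To give a flavor of inference over time in a dynamic Bayesian network, the following is a minimal toy sketch: a forward (filtering) pass of the simplest DBN, a hidden Markov model, that maintains a belief over action classes given a sequence of quantized key-pose observations. All action names, pose labels, and probabilities here are invented for illustration; the thesis's actual model is richer, jointly inferring pose, object of interaction, and action per actor.

```python
# Toy DBN (HMM) forward filtering over action classes from key-pose
# observations. Every name and number below is a hypothetical example,
# not the thesis's actual model or parameters.

ACTIONS = ["wave", "drink"]          # hidden action classes
KEY_POSES = ["arm_up", "arm_bent"]   # observed, quantized key poses

# P(action_t | action_{t-1}): actions tend to persist across frames.
TRANSITION = {
    "wave":  {"wave": 0.9, "drink": 0.1},
    "drink": {"wave": 0.1, "drink": 0.9},
}

# P(key_pose_t | action_t): each action prefers certain poses.
EMISSION = {
    "wave":  {"arm_up": 0.8, "arm_bent": 0.2},
    "drink": {"arm_up": 0.3, "arm_bent": 0.7},
}

PRIOR = {"wave": 0.5, "drink": 0.5}  # P(action_0)

def forward_filter(observations):
    """Return P(action_t | observations up to t) for each frame t."""
    beliefs = []
    # Initialize: prior weighted by the first observation's likelihood.
    belief = {a: PRIOR[a] * EMISSION[a][observations[0]] for a in ACTIONS}
    z = sum(belief.values())
    belief = {a: p / z for a, p in belief.items()}
    beliefs.append(belief)
    for obs in observations[1:]:
        # Predict through the transition model, then update with the
        # emission likelihood of the newly observed key pose.
        belief = {
            a: EMISSION[a][obs]
               * sum(belief[b] * TRANSITION[b][a] for b in ACTIONS)
            for a in ACTIONS
        }
        z = sum(belief.values())
        belief = {a: p / z for a, p in belief.items()}
        beliefs.append(belief)
    return beliefs

poses = ["arm_up", "arm_up", "arm_bent", "arm_bent", "arm_bent"]
result = forward_filter(poses)
best = max(result[-1], key=result[-1].get)
print(best)  # the belief shifts as the observed poses change
```

The per-frame normalization keeps each belief a proper distribution; in a fuller model of the kind the abstract describes, the hidden state would be a joint configuration of pose, object, and action rather than a single action label.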
Keywords/Search Tags: Video, Human, Using, Data, Action, Models, Activities