Video Recognition Based On Deep Visual Representation

Posted on:2020-11-12

Degree:Master

Type:Thesis

Country:China

Candidate:X S Qiao

Full Text:PDF

GTID:2428330626453276

Subject:Computer application technology

Abstract/Summary:

With the rapid development of computer vision, video recognition has become a popular research direction. It underpins applications such as video surveillance, autonomous vehicles, and virtual reality, which has drawn researchers' attention. Specifically, the task of action recognition aims to predict the actions of persons within a video through automatic analysis using pattern recognition and machine learning algorithms. Most existing work concentrates on detection, tracking, or designing robust features to encode motion information, while ignoring the high-level semantic information among samples. To solve this problem, in this paper we propose a unified framework for analysing Spatial-Temporal representation Across Grassmannian manifold and Euclidean space (ST-AGE). ST-AGE designs a new Spatial-Temporal Representation Volume, then projects this volume onto different spaces to measure similarity and analyse high-level semantic information. The major contributions of our work are summarized as follows:

(1) We design a new video representation named Spatial-Temporal Representation Volume (STRV). This representation contains spatial and temporal information simultaneously. Exploiting the representational capability of convolutional neural networks, we use fully-connected layer features to construct the spatial part while keeping their sequence information. In addition, based on improved dense trajectories, we obtain the motion information in the region of interest to reinforce the temporal part.

(2) We propose analysing the relationship among samples using manifold learning. We decompose the STRV into two parts: the spatial representation is projected onto the Grassmannian manifold, while the temporal representation is projected onto Euclidean space. We then fuse the two metrics linearly into a single kernel. Finally, an efficient multi-kernel SVM is used to classify the videos.

(3) We evaluate the performance of ST-AGE on four datasets, namely KTH, HMDB-51, UCF-50, and UCF-101. We also compare results under different conditions from multiple aspects. According to the experiments, ST-AGE achieves satisfactory performance on all four datasets.

In summary, ST-AGE concentrates on modeling the three-dimensional structure of the video and then analysing it across multiple spaces.
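The kernel fusion described in contribution (2) can be illustrated with a minimal sketch. The abstract does not state which Grassmannian kernel, Euclidean kernel, or fusion weight the thesis uses, so everything below is an assumption made for illustration: the projection kernel on the Grassmannian side, an RBF kernel on the Euclidean side, truncated SVD to obtain a subspace from per-frame features, and the hypothetical names `grassmann_point`, `projection_kernel`, `rbf_kernel`, `fused_kernel`, `alpha`, and `gamma`. The data are random placeholders, not features from any of the datasets.

```python
import numpy as np


def grassmann_point(frame_features, p):
    """Map a (d x T) matrix of per-frame features to an orthonormal
    (d x p) basis, i.e. a point on the Grassmannian manifold.
    Using the top-p left singular vectors is an assumption."""
    U, _, _ = np.linalg.svd(frame_features, full_matrices=False)
    return U[:, :p]


def projection_kernel(bases_a, bases_b):
    """Projection kernel k(Y1, Y2) = ||Y1^T Y2||_F^2 between two lists
    of orthonormal bases (one common Grassmannian kernel)."""
    K = np.empty((len(bases_a), len(bases_b)))
    for i, Ya in enumerate(bases_a):
        for j, Yb in enumerate(bases_b):
            K[i, j] = np.linalg.norm(Ya.T @ Yb, "fro") ** 2
    return K


def rbf_kernel(X_a, X_b, gamma):
    """RBF kernel on Euclidean vectors (rows of X_a and X_b)."""
    sq_dist = (
        np.sum(X_a ** 2, axis=1)[:, None]
        + np.sum(X_b ** 2, axis=1)[None, :]
        - 2.0 * X_a @ X_b.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-gamma * np.maximum(sq_dist, 0.0))


def fused_kernel(K_spatial, K_temporal, alpha):
    """Linear fusion of the two kernels; alpha in [0, 1] is a
    hypothetical mixing weight, typically chosen by cross-validation."""
    return alpha * K_spatial + (1.0 - alpha) * K_temporal


# Placeholder data: 6 "videos", each with a 32 x 10 spatial feature
# matrix (feature dimension x frames) and a 16-dim temporal descriptor.
rng = np.random.default_rng(0)
spatial = [rng.standard_normal((32, 10)) for _ in range(6)]
temporal = rng.standard_normal((6, 16))

bases = [grassmann_point(S, p=3) for S in spatial]
K = fused_kernel(
    projection_kernel(bases, bases),
    rbf_kernel(temporal, temporal, gamma=0.1),
    alpha=0.5,
)
```

The resulting Gram matrix `K` is symmetric and positive semi-definite, so it could be passed to any SVM that accepts a precomputed kernel, for example scikit-learn's `SVC(kernel="precomputed")`.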

Keywords/Search Tags:

Video recognition, Dense trajectory, Deep learning, Grassmannian manifold, Euclidean space

Related items

1	Action Recognition Based On Deep Learning Framework
2	The Study On Supervised Manifold Learning Algorithms In Pattern Recognition
3	The Representation And Recognition Of Trajectory Data Based On Path Signature Feature And Deep Learning Methods
4	Discriminative Manifold Learning In Face Recognition Applications
5	Research On Representation Learning And Prediction Model Of Crowd Movement Trajectory Based On Deep Learning
6	Study On The Learning Method Of Spatiotemporal Manifold Feature Of Human Action In 3D Motion Space
7	Research On Human Behavior Recognition Method Based On Improved Deep Learning Network
8	Research On Human Behavior Recognition Method Based On Video Image
9	Research Of Face Recognition Algorithm Based On Improving Manifold Learning And Deep Neural Network
10	Design And Implementation Of Video-Based Face Gender Recognition System Using Manifold Learning