Fine-grained Action Recognition Using Graph Model

Posted on:2021-05-27

Degree:Master

Type:Thesis

Country:China

Candidate:W Luo

Full Text:PDF

GTID:2518306503472574

Subject:Electronics and Communications Engineering

Abstract/Summary:


Action recognition in video remains one of the most popular unresolved problems in the computer vision community. It can be applied directly to intelligent monitoring, intelligent driving, human-computer interaction, and similar applications, and it is also the basis for many other video research tasks. With the development of deep learning, many new algorithms based on deep neural networks have emerged in this field. Mainstream approaches can be roughly divided into three families: 3D convolution, two-stream networks, and recurrent neural networks. These methods effectively exploit the temporal and spatial characteristics of video. They work well on general datasets such as UCF101, HMDB51, and Kinetics, but their performance on fine-grained action recognition tasks is unsatisfactory. We believe this is because they focus on modeling appearance and motion while largely ignoring the interactions between objects in the video.

This thesis presents a novel framework that uses graph models to model and reason about interactions in video. Objects and some class-related scene regions are first detected in the video. Their visual features and spatiotemporal position features are extracted as the nodes of the graph model, and the interactions between objects are represented as its edges. We experiment with three ways of computing the edge weights: the inner product, bilinear weighting, and a multi-layer perceptron. A graph convolutional network (GCN [1]) is then applied to this graph to explore the relations between objects and to fuse scene information with object information. In addition, we propose weight sharing and relation normalization to improve inner-product-based relation modules, which are widely used for relation modeling.

Experiments show that our algorithms outperform traditional CNN methods on fine-grained action recognition because they are better at representing the interactions between objects: accuracy reaches 43.6% for verb classification on the validation set of EPIC-Kitchens and 47.0% mAP on the test set of VLOG, which outperforms the state of the art.
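The pipeline described in the abstract (node features, pairwise edge weights, then graph convolution) can be illustrated with a minimal pure-Python sketch. It is not the thesis implementation: the function names, the toy feature vectors, the row-wise softmax used as the normalization step, and the single-layer update ReLU(A X W) are all assumptions chosen for illustration. The abstract does not specify how relation normalization or weight sharing are defined.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [dot(row, v) for row in M]


def inner_product_score(xi, xj):
    """Edge score as a plain inner product of two node features."""
    return dot(xi, xj)


def bilinear_score(xi, xj, W):
    """Edge score xi^T W xj with a (hypothetical) learned matrix W."""
    return dot(xi, matvec(W, xj))


def mlp_score(xi, xj, W1, w2):
    """Edge score from a one-hidden-layer MLP on the concatenated pair."""
    hidden = [max(0.0, h) for h in matvec(W1, xi + xj)]  # ReLU
    return dot(w2, hidden)


def softmax(row):
    """Numerically stable softmax; assumed here as the normalization."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def build_adjacency(X, score):
    """Score every node pair, then normalize each row to sum to 1."""
    return [softmax([score(xi, xj) for xj in X]) for xi in X]


def gcn_layer(A, X, W):
    """One graph-convolution step: ReLU(A X W)."""
    d_in, d_out = len(W), len(W[0])
    out = []
    for a_row in A:
        # Weighted sum of neighbor features for this node.
        agg = [sum(a * x[k] for a, x in zip(a_row, X)) for k in range(d_in)]
        # Linear projection followed by ReLU.
        out.append([max(0.0, sum(agg[k] * W[k][j] for k in range(d_in)))
                    for j in range(d_out)])
    return out


# Toy example: three nodes (e.g. two objects and one scene region),
# each with a 2-dimensional feature vector. Values are made up.
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
W = [[1.0, 0.0], [0.0, 1.0]]  # identity projection for readability

A = build_adjacency(X, inner_product_score)
H = gcn_layer(A, X, W)
```

Swapping `inner_product_score` for a closure over `bilinear_score` or `mlp_score` changes only how the edge weights are computed; the graph convolution step stays the same, which is what makes the three variants directly comparable.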

Keywords/Search Tags:

Computer Vision, Action Recognition, Graph Model, Scene Modeling


Related items

1	Research On Computer Vision Based Human Action Recognition Technology
2	Vision-Based 3D Scene Modeling Research And Implementation
3	Research On Semantic Understanding For Action Recognition
4	Research And Implementation Of Object Grasping Recognition Algorithm Based On Computer Vision
5	Research On Conjunct Static-Dynamic Efficient Method Of Action Recognition
6	Research On Human Action Recognition Based On Computer Vision
7	Research Of Human Action Recognition Algorithm In The Video
8	Research On Human Action Recognition Based On Video
9	Spatiotemporal Modeling For Video Human Action Recognition
10	Research On Human Action Recognition Based On Computer Vision