Research On Video Action Detection Based On Sensitive Feature Selection And Action Region Enhancement

Posted on:2020-07-02

Degree:Master

Type:Thesis

Country:China

Candidate:Y Yang

Full Text:PDF

GTID:2428330599958956

Subject:Electronics and Communications Engineering

Abstract/Summary:

Video action detection, which comprises action classification and localization, is a fundamental task in computer vision. Specifically, an algorithm must find the start and end time of each action instance in a video and assign a category label to each instance. Video action detection plays a key role in many practical applications, e.g. intelligent surveillance, video retrieval, motion-sensing games, healthcare, and smart device control. Although deep learning has driven tremendous progress in classifying short trimmed videos, video action detection remains much more challenging because most real-world videos are long and untrimmed. Recent state-of-the-art methods focus on generating more accurate action proposals and on training better classifiers and regressors. To make full use of the main action area in a video and to deal with the inherent discrepancy between action classification and localization, we propose a multi-task framework that learns to enhance the action region adaptively and to select sensitive features autonomously. In summary, the main contributions of this work are as follows:

(1) An adaptive region enhancement method based on action region selection is proposed. The core of this method is to let the network attend to the action area of the video, enhancing the contribution of the action area to the detection task while suppressing the influence of action-irrelevant areas. Concretely, the network learns to focus on the subject area of the video automatically through a well-designed adversarial training strategy and loss function. Furthermore, we introduce a mask mechanism that explicitly guides the network to increase the contribution of the main action area for better recognition.

(2) A sensitive feature selection method is proposed. Our motivation is that key-frame selection is crucial for action classification, whereas the relations among frames are essential for action localization. To handle this inherent difference between the two tasks, the method consists of two sub-modules, one for choosing key frames and one for learning the relations among frames. Specifically, the former scores the importance of each frame in the video, and the latter models the correlation between each pair of frames.

Experimental results show that the proposed module is well suited to video action detection in real-world settings. Based on the above designs, the final model proposed in this work achieves 38.97% mAP@0.5 on the THUMOS14 dataset, outperforming the basic model (SSN+BSN) by more than 2% and the baseline (SSN) by about 5%. In addition, we observe consistent performance gains when various basic networks are equipped with the proposed module.
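
For readers unfamiliar with the mAP@0.5 figure quoted above: in temporal action detection, a predicted segment is normally counted as correct only if its class label matches a ground-truth instance and their temporal intersection-over-union (tIoU) reaches the threshold, here 0.5. The following is a minimal sketch of that overlap test, not the thesis's evaluation code; the function names `temporal_iou` and `is_true_positive` and the (start, end) tuple layout are assumptions made for illustration.

```python
def temporal_iou(pred, gt):
    """Temporal IoU between two (start, end) segments, e.g. in seconds.

    The (start, end) tuple layout is an illustrative assumption.
    """
    p_start, p_end = pred
    g_start, g_end = gt
    # Length of the overlap; zero when the segments are disjoint.
    inter = max(0.0, min(p_end, g_end) - max(p_start, g_start))
    # Union = sum of both lengths minus the overlap counted twice.
    union = (p_end - p_start) + (g_end - g_start) - inter
    return inter / union if union > 0 else 0.0


def is_true_positive(pred, gt, threshold=0.5):
    """Overlap test behind the '@0.5' in mAP@0.5.

    Class-label matching is assumed to be checked separately.
    """
    return temporal_iou(pred, gt) >= threshold
```

For example, a prediction spanning 2 s to 10 s against a ground-truth instance spanning 4 s to 10 s overlaps for 6 s out of a union of 8 s, so it passes the 0.5 threshold. Average precision is then computed per class from the ranked detections and averaged over classes to give mAP.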

Keywords/Search Tags:

Video action detection, Adaptive area enhancement, Sensitive feature selection, Key frame, Graph convolution

Related items

1	Video Action Detection Based On Deep Learning
2	Attention Mechanism Based Action Recognition
3	The Research On Key Frame Selection And Feature Matching In Video Retrieval
4	Action Recognition Based On Human Skeleton Graph Convolution And Image Convolution Fusion
5	Video Object Detection Based On Adaptive Convolution Network And Visual Attention Mechanism
6	Human Skeleton Action Recognition Based On Spatiotemporal Graph Attention Convolution Network
7	Video Action Recognition Based On 2D Convolution Network Under Spatio-Temporal Feature Enhancement Mechanism
8	Research On Action Recognition Based On Action Feature Optimization And Deep Learning
9	Research On Video Human Action Recognition Algorithms Based On Deep Learning
10	Research On Video Action Recognition And Detection Method Based On Deep Learning