The Design And Implementation Of Few-shot Video Classification Based On Deep Learning Framework

Posted on:2022-03-09

Degree:Master

Type:Thesis

Country:China

Candidate:J L He

Full Text:PDF

GTID:2518306341450604

Subject:Computer Science and Technology

Abstract/Summary:

With the rapid development of computer and mobile internet technology, watching and sharing videos has become part of people's daily life, and video data has become an important information carrier. Processing video data manually is unrealistic. Traditional neural network models require a large amount of annotated data, and manual annotation is time-consuming and laborious. Video classification in few-shot scenarios has therefore become an active research problem in computer vision. Few-shot video classification refers to completing a video classification task with only a few labeled samples.

This thesis studies two basic tasks in video classification, action classification and background detection, and how to perform them when few video annotations are available. Video data naturally has spatial and temporal attributes, and fully extracting information along both dimensions is particularly important for video classification. Existing work fails to consider the relative relationships between frames and the importance of individual frames in a video, so the temporal characteristics of the video cannot be fully extracted in a few-shot setting.

First, to address insufficient feature extraction and utilization in few-shot action recognition, this thesis proposes a spatial feature representation method based on a siamese network, which combines ResNet-18 and AlexNet to extract the spatial features of a video. Second, it proposes a temporal feature extraction method based on a sparse attention mechanism. The core idea is to highlight the influence of key frames while computing the relative relationships between video frames, so as to fully extract the temporal features of the video. Finally, building on these extracted features, it proposes a deep relationship module based on the idea of alignment to make full use of the temporal and spatial features of each sample. For few-shot background detection, this thesis proposes an algorithm based on a siamese network. Experimental results on multiple real datasets show that the few-shot action recognition algorithm based on the sparse attention mechanism and the few-shot background detection algorithm based on the siamese network make full use of the extracted features and significantly improve classification accuracy.
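
The sparse attention idea described in the abstract (emphasize key frames while relating frames to one another) can be illustrated with a minimal sketch. The abstract does not say how sparsity is obtained, so the top-k selection used here is an assumption, as are the function name `sparse_frame_attention`, the `top_k` parameter, and the use of scaled dot-product scores on raw per-frame features. It is not the thesis's implementation.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sparse_frame_attention(frames, top_k):
    """Top-k sparse self-attention over per-frame feature vectors.

    frames: list of T feature vectors (each a list of D floats).
    top_k:  how many frames each frame may attend to (hypothetical parameter).
    Returns (outputs, weights): T attended vectors and a T x T weight matrix.
    """
    T, D = len(frames), len(frames[0])
    k = min(top_k, T)
    outputs, weights = [], []
    for query in frames:
        # Scaled dot-product score between this frame and every frame.
        scores = [
            sum(a * b for a, b in zip(query, frame)) / math.sqrt(D)
            for frame in frames
        ]
        # Assumed sparsification: keep the k best-matching frames,
        # give every other frame exactly zero weight.
        keep = sorted(range(T), key=lambda i: scores[i], reverse=True)[:k]
        probs = softmax([scores[i] for i in keep])
        row = [0.0] * T
        for i, p in zip(keep, probs):
            row[i] = p
        weights.append(row)
        # Attended feature: weighted sum of the selected frames.
        outputs.append(
            [sum(row[i] * frames[i][d] for i in range(T)) for d in range(D)]
        )
    return outputs, weights


# Four toy "frames" with 2-dimensional features.
frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]]
outputs, weights = sparse_frame_attention(frames, top_k=2)
```

Each row of `weights` sums to 1 and has at most `top_k` nonzero entries, so every frame's temporal feature is built from only its most related frames. In a real model the frame features would come from a learned backbone and the scores from learned projections.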

Keywords/Search Tags:

few-shot learning, action recognition, siamese network, attention mechanism

Related items

1	An Improved Action Recognition Method With 3D Convolution Neural Network
2	Research And Application On One-shot Object Detection Based On Attention Mechanism
3	Research On Human Action Recognition Method Based On Deep Learning
4	Research On Few-Shot Object Detection Technology
5	Research On Human Action Recognition Method Integrating Visual Attention Mechanism And Deep Learning
6	Action Recognition Based On Two Stream Spatial-Temporal Attention Network
7	Research On Cross-Domain Image Recognition Algorithm Based On Pair-Wise Generalization Network And Attention Mechanism
8	Attention Mechanism Based Deep Network For Human Action Recognition In Video
9	Studies On Action Recognition In Video Based On Deep Learning
10	Multi-shot And Zero-shot Learning For Human Action Recognition