Short Video Classification Based On Modal Fusion

Posted on:2021-05-27

Degree:Master

Type:Thesis

Country:China

Candidate:H Zhang

GTID:2428330614963734

Subject:Electronic and communication engineering

Abstract/Summary:

With the continuous development of multimedia and network technology, a large amount of streaming media data is generated on the Internet and mobile platforms, and video accounts for a large proportion of it. Video classification is a basic research direction in computer vision. It is an important intermediate step in video annotation and video retrieval, and an important method for managing huge volumes of multimedia data. In recent years, deep learning has developed continuously and achieved very good results, and more and more researchers have applied it to video-related technologies. Video classification algorithms based on deep learning not only improve classification accuracy but also expand the number of categories that can be classified, and they have gradually been adopted in practical video classification tasks. Video classification based on deep learning therefore has high research and commercial value. This thesis studies short video classification using deep learning techniques. First, the common types of deep learning networks are introduced, including convolutional neural networks, recurrent neural networks, and generative adversarial networks, along with commonly used deep learning programming frameworks and classification models. Then, since the main research object of this thesis is video data, we focus on the feature extraction methods for each modality. Classic algorithms in traditional image and video feature extraction are introduced first, and the research then focuses on deep-learning-based feature extraction algorithms for video, audio, and text. After that, the overall framework of the proposed algorithm is designed: the principles of the clustering network and the attention mechanism network are studied, and a multi-modal feature fusion method is then studied to build the overall network framework. Finally, based on the proposed algorithm, a large number of comparative experiments are carried out on the relevant datasets, and the experimental data are analyzed in detail. Based on the experimental results, various improvements are proposed, and experiments are conducted on a public dataset to compare the approach with other models.
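
The abstract does not spell out how the attention mechanism and the multi-modal fusion are combined, so the following is only an illustrative sketch of one common pattern: scaled dot-product attention over per-modality feature vectors, followed by a weighted sum. The function names `softmax` and `attention_fuse`, the `query` vector, and the toy feature values are all assumptions made for this example and are not taken from the thesis.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_fuse(features, query):
    """Fuse per-modality feature vectors with dot-product attention.

    features: dict mapping a modality name to its feature vector
              (all vectors must have the same length as `query`).
    query:    hypothetical query vector; in a trained model this
              would be learned, here it is supplied by hand.
    Returns (fused_vector, weights_by_modality).
    """
    names = list(features)
    dim = len(query)
    # Scaled dot-product score of each modality against the query.
    scores = [
        sum(f * q for f, q in zip(features[name], query)) / math.sqrt(dim)
        for name in names
    ]
    weights = softmax(scores)
    # Attention-weighted sum of the modality vectors.
    fused = [
        sum(w * features[name][i] for w, name in zip(weights, names))
        for i in range(dim)
    ]
    return fused, dict(zip(names, weights))


# Toy 4-dimensional features standing in for real extractor outputs.
features = {
    "video": [1.0, 0.0, 0.0, 0.0],
    "audio": [0.0, 1.0, 0.0, 0.0],
    "text": [0.0, 0.0, 1.0, 0.0],
}
query = [2.0, 0.0, 0.0, 0.0]  # assumed query that favors the video modality
fused, weights = attention_fuse(features, query)
```

In this toy setup the query aligns with the video vector, so the video modality receives the largest attention weight and dominates the fused representation. A real system would feed the fused vector into a classifier and learn the query jointly with the feature extractors.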

Keywords/Search Tags:

video classification, deep learning, clustering network, attention mechanism

Related items

1	Research On Video Classification And Detection With Deep Learning
2	Research On Attention Based Image Classification With Deep Learning
3	Research On Text Classification Model Based On Deep Learning And Attention Mechanism
4	Research On Image Classification Method Combining Visual Attention Mechanism And Deep Learning
5	Research On Deep Learning Model Of Point Cloud Classification With Attention Mechanism
6	Research On Emotion Classification Of Texts Based On Deep Learning
7	Research On Text Classification Based On Deep Learning And Attention Mechanism
8	Research On Text Classification Method Based On Deep Learning And Attention Mechanism
9	Two-Stream Video Classification Based On Deep Learning
10	Research And Application Of Text Classification Technology Based On Deep Learning