Video Classification Based On Fusion Feature And Dual Stream Network

Posted on:2021-01-04

Degree:Master

Type:Thesis

Country:China

Candidate:X Y Zhang

Full Text:PDF

GTID:2428330602481634

Subject:Signal and Information Processing

Abstract/Summary:

With the development of the Internet, both PC video platform traffic and mobile video volume have grown explosively, and the types of video have become increasingly diverse. Classifying and managing user-uploaded video efficiently and accurately is therefore of great significance to video platforms and their users. This thesis studies the video classification model in three modules. First, a shot boundary detection method based on an adaptive double-threshold sliding window is studied. Second, a key frame extraction algorithm based on adaptive non-uniform partitioning is studied. Third, a video classification model based on an improved dual-stream deep network framework is designed. The main research contents are as follows:

(1) This thesis proposes a shot boundary detection model based on an adaptive double-threshold sliding window. Most current shot detection methods rely on histograms and thresholds: color histograms are extracted from the video frames, and boundaries are then judged against thresholds. The disadvantage is that the shot boundary is judged from the thresholds alone, and the thresholds are set only from differences between global histogram feature vectors, which ignores the sequential relationship between consecutive frames. To address this, the thesis proposes computing the two thresholds adaptively and then locating shot transitions using the relative position of each frame within the window. Experiments show that the adaptive double-threshold sliding-window method detects shot boundaries more accurately than the other detection methods compared, with accuracy higher by 6.2 percentage points, recall higher by 7.0 percentage points, and F1 value higher by 4.1% on average.

(2) This thesis proposes a key frame extraction model based on adaptive non-uniform blocks. Traditional key frame extraction relies on computing content features or motion features of video frames, or on clustering. However, these feature extraction methods do not take the content layout of the video frame into consideration. The main subject usually occupies one part of the image, while the other areas change little. If global features are extracted without distinguishing these unimportant areas, the subject features are inevitably weakened. The thesis therefore proposes to partition video frames non-uniformly, assign different weights to different regions, and finally compute and extract key frames using adaptive thresholds. Experiments show that, compared with other extraction methods, the accuracy of the proposed model increases by 5.0 percentage points on average, the recall increases by 2.75 percentage points, and the F1 value increases by 2.75%.

(3) This thesis proposes an improved dual-stream network model, DM-TS. The original dual-stream network model uses two parallel convolutional neural networks as feature extraction units to extract the temporal and spatial feature vectors of the video, and then combines the two feature vectors for classification. However, convolutional neural networks focus on the spatial characteristics of video and ignore the time-domain information between frames. The thesis therefore uses a ResNet residual network together with a bidirectional long short-term memory network (BiLSTM) as the spatial stream module, to further extract the time-domain correlation between preceding and following frames, and uses multiple 3D CNNs as the temporal stream module, to extract spatiotemporal feature information at different scales. Finally, the classification results of the two streams are combined to classify the video. Experiments show that the accuracy of the DM-TS model is improved by 4.3%.
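The abstract does not give the threshold formulas, so the following is only a minimal sketch of the general idea in part (1): inter-frame histogram differences are compared against two thresholds that are recomputed inside a sliding window. Everything specific here is an assumption made for illustration and not taken from the thesis: the function names, the L1 histogram distance, the window size, and the mean-plus-k-standard-deviations thresholds controlled by the hypothetical parameters `k_high` and `k_low`.

```python
from statistics import mean, pstdev


def gray_histogram(pixels, bins=16):
    """Normalized histogram of 8-bit grayscale pixel values (0..255)."""
    counts = [0] * bins
    for p in pixels:
        counts[min(p * bins // 256, bins - 1)] += 1
    total = len(pixels)
    return [c / total for c in counts]


def histogram_difference(h1, h2):
    """L1 distance between two normalized histograms (range 0..2)."""
    return sum(abs(a - b) for a, b in zip(h1, h2))


def detect_shot_boundaries(diffs, window=5, k_high=3.0, k_low=1.5):
    """Label inter-frame differences using window-local double thresholds.

    diffs[i] is the histogram difference between frame i and frame i + 1.
    Returns (cuts, candidates): indices that exceed the high threshold and
    dominate their window, and indices that only exceed the low threshold.
    The threshold rule (mean + k * std of the neighbours) is an assumed
    stand-in, not the formula used in the thesis.
    """
    cuts, candidates = [], []
    for i, d in enumerate(diffs):
        lo, hi = max(0, i - window), min(len(diffs), i + window + 1)
        # Statistics come from the neighbours only, so a spike does not
        # inflate its own thresholds.
        neighbours = diffs[lo:i] + diffs[i + 1:hi]
        if not neighbours:
            continue
        mu, sigma = mean(neighbours), pstdev(neighbours)
        t_high = mu + k_high * sigma
        t_low = mu + k_low * sigma
        if d > t_high and d >= max(neighbours):
            cuts.append(i)        # abrupt change: likely hard cut
        elif d > t_low:
            candidates.append(i)  # moderate change: possible gradual transition
    return cuts, candidates


if __name__ == "__main__":
    # Synthetic differences with one obvious spike between frames 4 and 5.
    diffs = [0.02, 0.03, 0.02, 0.04, 0.9, 0.03, 0.02, 0.03, 0.02, 0.03]
    print(detect_shot_boundaries(diffs))
```

Because both thresholds are derived from the neighbouring differences, the same spike is judged relative to local activity: a value that marks a cut in a static scene may fall below the low threshold in a fast-moving one. In practice the histograms would be computed from decoded frames, for instance with an image library, rather than from plain pixel lists.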

Keywords/Search Tags:

Shot Detection, Key Frame, Residual Network, BiLSTM, DM-TS Model, Video Classification

Related items

1	Research On Video Lens Edge Detection And Key Frame Extraction Algorithm
2	Research On Vulnerability Detection Based On BiLSTM Model
3	Research On KBQA Based On Subgraph Ranking And Residual BILSTM
4	Research On Passive Forensics Of Video Motion-compensation Frame-interpolation
5	Design And Implementation Of Content-based News Video Retrieval Prototype System
6	Video Smoke Detection Based On 3D Residual Dense Network
7	The Distortion Model Of Reconstructed Picture Considering Reference Frame In Video Coding
8	Design And Implementation Of LiDAR Data Classification Algorithm Based On Residual Network
9	BiLSTM-CNN Text Classification Based On Attention Mechanism And Residual Connection
10	Research On Traffic Video Detection And Vehicle Classification Based On Deep Learning