Short Videos Understanding Based On Deep Learning

Posted on:2020-10-11

Degree:Master

Type:Thesis

Country:China

Candidate:X Dong

Full Text:PDF

GTID:2518306464487054

Subject:Computer technology

Abstract/Summary:

With the rapid development of network communication technology and the widespread use of mobile devices such as smartphones and tablets, short video platforms represented by TikTok and Kuaishou have emerged. Based on the application scenarios of short video, this work focuses on scene recognition, action recognition, and joint feature learning. The main work and results are as follows:

To address scene recognition in short video, and in particular the blur found in short video scenes, a deep fusion network based on VGGNet is proposed. VGGNet16 is used to learn global features, and VGGNet19 is used to learn image details. To handle blur, the deep fusion network extracts blur features, the blurred image is up-sampled, and a Euclidean distance loss measures the similarity between the blurred image and the clear image so that the network learns to remove the blur. On the 2017-AI-Challenger-scene-classification dataset the top-3 result is 92.2%, and on the Charades short video dataset the top-3 result is 78.9%. These results indicate that the proposed method is effective and that it is more robust because it can recognize blurred images.

To address action recognition in short video, this thesis first proposes a key frame extraction algorithm based on mutual information entropy, which uses a sliding window to preserve the temporal information between frames. Building on the extracted key frames, a dual-stream CNN method based on Deform-GoogLeNet, which uses deformable (variable) convolutions, is proposed. The two streams extract the RGB features and the optical flow features of the image separately, and a weighted average of the two produces the action recognition result. The result on the Charades dataset is higher than that of similar fusion algorithms, which indicates that the proposed algorithm is effective.

To further improve action recognition in short video, a dictionary-learning-based approach to scene features and joint features is proposed. Dictionary learning and sparse representation help the model find the salient features that improve action recognition, with the scene features serving as context information. Experiments on the kitchen-related data in the Charades dataset indicate that the proposed action recognition algorithm combined with scene information outperforms methods that use action recognition alone, which supports the proposed method.
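
The abstract names two steps without giving their details: key frame extraction from mutual information inside a sliding window, and weighted-average fusion of the RGB and optical flow streams. The following is a minimal sketch of how such steps might look, not the thesis's actual implementation. Everything specific here is an assumption made for illustration: the function names (`mutual_information`, `select_key_frames`, `fuse_scores`), the 16-bin histogram, the window that advances by its own length, the rule of keeping the frame that shares the least mutual information with its predecessor, and the default stream weight of 0.5.

```python
import math
from collections import Counter


def mutual_information(frame_a, frame_b, bins=16, levels=256):
    """Mutual information in bits between two grayscale frames.

    Each frame is a flat sequence of integer intensities in [0, levels).
    """
    if len(frame_a) != len(frame_b) or len(frame_a) == 0:
        raise ValueError("frames must be non-empty and the same size")
    # Quantize intensities into a few bins so the joint histogram
    # is not too sparse.
    qa = [p * bins // levels for p in frame_a]
    qb = [p * bins // levels for p in frame_b]
    n = len(qa)
    joint = Counter(zip(qa, qb))
    hist_a = Counter(qa)
    hist_b = Counter(qb)
    mi = 0.0
    for (a, b), count in joint.items():
        # p(a,b) * log2(p(a,b) / (p(a) * p(b))), written with raw counts.
        mi += (count / n) * math.log2(count * n / (hist_a[a] * hist_b[b]))
    return mi


def select_key_frames(frames, window=4, bins=16):
    """Return one key frame index per window, in temporal order.

    Assumed rule (not taken from the thesis): inside each window, keep
    the frame that shares the least mutual information with its
    predecessor, i.e. the frame with the largest content change.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    # The first frame has no predecessor, so it is picked only when it
    # is alone in its window.
    mi_prev = [math.inf]
    for i in range(1, len(frames)):
        mi_prev.append(mutual_information(frames[i - 1], frames[i], bins))
    keys = []
    # Assumption: the window advances by its own length (no overlap).
    for start in range(0, len(frames), window):
        stop = min(start + window, len(frames))
        keys.append(min(range(start, stop), key=lambda i: mi_prev[i]))
    return keys


def fuse_scores(rgb_scores, flow_scores, rgb_weight=0.5):
    """Weighted average of per-class scores from the two streams."""
    if len(rgb_scores) != len(flow_scores):
        raise ValueError("both streams must score the same classes")
    if not 0.0 <= rgb_weight <= 1.0:
        raise ValueError("rgb_weight must lie in [0, 1]")
    flow_weight = 1.0 - rgb_weight
    return [rgb_weight * r + flow_weight * f
            for r, f in zip(rgb_scores, flow_scores)]
```

Because one index is kept per window and the indices are returned in order, the selected key frames keep the temporal ordering of the original clip. The stream weight would normally be tuned on validation data; the abstract does not report the value that was used.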

Keywords/Search Tags:

scene recognition, scene deblurring, action recognition, key frame extraction, joint learning

Related items

1	Research On The Key Technology Of Deep Learning Based Action Recognition And Tourism Scene Classification
2	Joint Recognition Of Scene And Landmarks Based On Supervised Comparative Learning
3	Cross-Scene Human Action Recognition Based On Wi-Fi Signal
4	Action Recognition Algorithm Based On Spatio-Temporal Scene Graph And Its Application Research
5	Feature Learning Based Campus Scene Recognition And Location
6	Active Learning Based Visual Scene Understanding
7	Research On Video Scene Structure Analysis And Scene Recognition Technique
8	Research On Scene Text Extraction And Recognition Based On Deep Learning
9	Research On Scene Character Recognition Technology In Image
10	Automatic Complex Scene Acquisition And Reconstruction Based On Mobile Robot