
Research On Video Hashing Retrieval Based On Attention Mechanism

Posted on: 2021-04-09
Degree: Master
Type: Thesis
Country: China
Candidate: Y X Wang
GTID: 2428330602983749
Subject: Computer Science and Technology
Abstract/Summary:
In recent years, with the rapid development of the Internet and mobile communication technology, multimedia data of all kinds has grown explosively. Video in particular has become an important way for Internet users to obtain information because of its vividness. The wide availability of TV series and movies on video websites and platforms has accelerated this trend, and the rise of short-video platforms has made it easy for users to upload and download videos quickly. The resulting continuous stream of video data enriches users' cultural and entertainment life, but it also poses a series of challenges for video search and recommendation. How to retrieve the desired results from massive video collections has become a hot research topic.

Content-based video hashing retrieval is a feasible solution to this problem. It maps video content into discrete binary codes, that is, hash codes, to enable accurate and fast retrieval over large-scale video data. Because hash codes are binary, the Hamming distance between the hash codes of different videos can be computed with a fast XOR operation, which greatly reduces computational cost and storage space.

Existing video hashing retrieval methods treat every frame in a video as equally important. In practice, however, importance varies from frame to frame, so a video hashing method should take the importance of individual frames into account. Motivated by this, we propose a method called Attention-based Video Hashing (AVH). The main work of the proposed method is as follows:

(1) The thesis proposes a network structure that combines a Convolutional Neural Network (CNN) and a Long Short-Term Memory (LSTM) network to learn the spatial and temporal information of each video and obtain its spatial and temporal features. This lays a foundation for video retrieval based on hash learning.
(2) An attention mechanism assigns a different weight to each frame in the video, and the changes in these weights capture the most representative information in the video, which improves the representation ability of the learned video features.
(3) Based on the Siamese network architecture, the proposed method constructs video pairs as network inputs, designs and optimizes an objective function over these pairs, and adds constraints on hash code properties to the objective function to reduce redundant information, which enhances the representation ability of the hash codes.

By using the Convolutional Neural Network together with the attention-based LSTM network, the spatial and temporal information of a video can be learned simultaneously, yielding a good video-level feature representation. Combined with the loss function based on video pairs, this produces video hash codes with strong discriminative ability. Experimental results on three public datasets show that the proposed method significantly improves video retrieval accuracy and other evaluation metrics compared with existing methods.
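
Two ideas in the abstract can be illustrated with a small sketch: weighting frames by attention before forming a video-level feature, and comparing binary hash codes with XOR. The code below is not the AVH network. It assumes the per-frame features and attention scores are already given (in the thesis they would come from the CNN and the attention-based LSTM), and the function names, toy values, and sign-threshold binarization are all hypothetical choices made for illustration.

```python
import math


def softmax(scores):
    """Turn raw per-frame scores into attention weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(frame_features, frame_scores):
    """Weighted sum of per-frame feature vectors -> one video-level vector."""
    weights = softmax(frame_scores)
    dim = len(frame_features[0])
    return [
        sum(w * frame[d] for w, frame in zip(weights, frame_features))
        for d in range(dim)
    ]


def binarize(feature):
    """Sign-threshold a real-valued vector and pack the bits into an int.

    Assumption: a plain sign threshold stands in for the learned hash layer.
    """
    code = 0
    for value in feature:
        code = (code << 1) | (1 if value >= 0 else 0)
    return code


def hamming_distance(code_a, code_b):
    """XOR marks the differing bits; counting the 1s gives the distance."""
    return bin(code_a ^ code_b).count("1")


# Toy data (hypothetical): 3 frames per video, 4-dimensional frame features.
video_a = [[0.9, -0.2, 0.4, -0.7], [0.8, -0.1, 0.5, -0.6], [-0.3, 0.6, -0.2, 0.1]]
video_b = [[-0.5, 0.7, 0.3, -0.4], [-0.6, 0.8, 0.2, -0.5], [0.4, -0.3, -0.1, 0.2]]
scores = [2.0, 2.0, -1.0]  # hypothetical scores: the third frame matters least

code_a = binarize(attention_pool(video_a, scores))
code_b = binarize(attention_pool(video_b, scores))
print(format(code_a, "04b"), format(code_b, "04b"), hamming_distance(code_a, code_b))
```

With these toy values the low-scored third frame barely affects the pooled feature, so the two codes come out as 1010 and 0110 and differ in two bits. Packing the bits into an integer is what lets the comparison reduce to a single XOR followed by a bit count.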
Keywords/Search Tags: Video Hashing, Video Retrieval, Deep Learning, Attention Mechanism