
Research On Video And Text Storage For Content Query

Posted on: 2022-09-09
Degree: Master
Type: Thesis
Country: China
Candidate: S Y Feng
Full Text: PDF
GTID: 2518306572991049
Subject: Computer system architecture
Abstract/Summary:
The rapid development and popularization of Internet technology has triggered the exponential growth of text, image, audio, and video data, posing great challenges to data storage and retrieval. Multimodal data is characterized by its great variety and strong heterogeneity, which makes it difficult to manage and utilize effectively and further hinders the mining of its potential value. Therefore, this paper studies a video-and-text storage and retrieval system based on the semantic content similarity of the data, built with deep learning, storage optimization, and related techniques. The main challenges are the huge difference between video and text data and the semantic gap between them, which make relevant and effective queries difficult. As a result, traditional relational database systems can neither effectively manage cross-modal data such as video and text nor establish correlations between cross-modal data based on semantic content similarity. In this paper, a video-text cross-modal query model, VTCRH, is constructed with deep neural networks; it extracts the semantic features of video and text respectively and realizes cross-modal storage and query of video and text.

Specifically, in the offline stage, text and video data are fed into the hashing network, and the corresponding semantic hash codes are obtained through the feature-extraction and hashing networks. Next, the semantic hash codes are stored in the Neo4j graph database: each hash code serves as a node, and the Hamming distance between hash codes serves as the edge weight, forming a hash graph that supports fast hash-based retrieval. Finally, in the online stage, incoming text or video data is processed in real time to obtain its hash code, the hash graph is searched within a given retrieval radius, and the target files are returned from the underlying storage system, thereby realizing cross-modal video-text storage and retrieval based on semantic content similarity.

The proposed VTCRH model has been evaluated on three public datasets: MSR-VTT, MSVD, and TGIF. The results show that the model is compatible with existing storage systems and can stably extract the semantic content features of video and text data at a small resource cost, thus completing the cross-modal retrieval task. Through the optimized design of the hash graph in the Neo4j database, the data storage cost and the inter-node communication cost are reduced, and the query efficiency of the system is improved.
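The offline/online pipeline described above (hash codes as graph nodes, Hamming distance as edge weights, radius-bounded lookup at query time) can be sketched in plain Python. This is a minimal illustration, not the thesis's implementation: an in-memory dictionary stands in for the Neo4j hash graph, hash codes are represented as small integers, and all function names are assumptions introduced here.

```python
from itertools import combinations

def hamming(a: int, b: int) -> int:
    """Hamming distance between two hash codes, represented as integers."""
    return bin(a ^ b).count("1")

def build_hash_graph(codes):
    """Offline stage (sketch): connect every pair of stored hash codes,
    weighting each edge by Hamming distance. In VTCRH this graph lives
    in Neo4j; a nested dict stands in for it here."""
    graph = {c: {} for c in codes}
    for a, b in combinations(codes, 2):
        d = hamming(a, b)
        graph[a][b] = d
        graph[b][a] = d
    return graph

def query(graph, query_code: int, radius: int):
    """Online stage (sketch): return all stored hash codes whose
    Hamming distance to the query code is within the retrieval radius."""
    return sorted(c for c in graph if hamming(c, query_code) <= radius)
```

In a real deployment the query-side feature-extraction and hashing networks would produce `query_code` from the incoming video or text, and the radius search would be pushed down to the graph database rather than scanning all nodes.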
Keywords/Search Tags:Content retrieval, Content similarity, Hash map, Video and text