
Design And Implementation Of Distributed Similar Video Retrieval System

Posted on: 2022-09-09
Degree: Master
Type: Thesis
Country: China
Candidate: F R Yan
Full Text: PDF
GTID: 2518306722472944
Subject: Master of Engineering
Abstract/Summary:
With the spread of smartphones and 5G networks, video applications such as Kwai, TikTok, and iQIYI have become part of everyday life. The volume of user-generated video has grown rapidly, and pirated videos keep appearing. Some users take other people's original videos and repost them on other websites for profit, using nearly cost-free means such as copying and downloading. This seriously harms the interests of the original authors: producing an original video often takes considerable labor and material resources, yet the result is easily appropriated by others, which hurts both the development of the industry and the protection of copyright. For copyright protection and risk avoidance, video websites and apps need to review video submissions. However, large video websites receive hundreds of thousands to millions of submissions every day, and manually reviewing each one to decide whether it is original is labor-intensive and slow. We therefore aim to build a retrieval library of the company's existing videos, run similar-video retrieval against that library for each new submission, filter out videos suspected of being reposted without authorization, and send them to the manual review channel.

For detecting similar videos, researchers have proposed extracting global and local features from video key frames to generate video features, and building indexes with tree structures or locality-sensitive hashing to match features and retrieve similar videos. In recent years, convolutional neural networks (CNNs) have achieved good results in image content recognition, and the continuing development of vector retrieval technology has made high-performance feature vector indexes feasible. This thesis proposes a scheme that extracts features with a CNN model and builds a feature vector index with the Faiss engine to provide efficient similar-video retrieval.

The system is divided into five modules: task scheduling, feature extraction, feature index, feature storage, and external retrieval service. The task scheduling module is the entry point through which the system processes video submissions; it handles the submission metadata and orchestrates calls to the other modules. The feature extraction module extracts video key frames and calls the CNN model to extract feature vectors. The feature index module builds the feature vector index and provides similar-video retrieval using the Faiss engine. The feature storage module writes video features to disk, uploads them, and stores them persistently in HDFS. The external retrieval service module is the single entry point for external similar-video retrieval requests: it queries the cache cluster and accesses the index cluster to obtain retrieval results, then returns them to the upstream caller. The system is deployed in a distributed manner, and the modules exchange information through an RPC framework to improve performance and efficiency.

After testing and online operation, the system processes 500,000 new video submissions per day and filters out about 12,000 duplicate videos, improving both copyright protection and review efficiency. In terms of performance, the time from a new submission entering the system to a similar-video detection result is under 1 minute, the feature index library covers about 250 million videos, and a single retrieval takes about 30 ms on average. Overall, the system meets its business and performance requirements and has run stably since going online.
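
The retrieval step described above can be illustrated with a minimal sketch. It is not the thesis implementation: the hand-written vectors stand in for CNN key-frame features, mean pooling is an assumed way to combine key-frame vectors into one video-level vector, the exhaustive `FlatIndex` class is a hypothetical stand-in for a Faiss index, and the `0.95` threshold is an arbitrary illustrative value.

```python
import math
from typing import Dict, List, Tuple

Vector = List[float]


def l2_normalize(vec: Vector) -> Vector:
    """Scale a vector to unit length so inner product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return list(vec)
    return [x / norm for x in vec]


def video_feature(frame_features: List[Vector]) -> Vector:
    """Mean-pool key-frame vectors into one video vector (assumed aggregation)."""
    if not frame_features:
        raise ValueError("need at least one key-frame feature")
    dim = len(frame_features[0])
    count = len(frame_features)
    pooled = [sum(f[i] for f in frame_features) / count for i in range(dim)]
    return l2_normalize(pooled)


class FlatIndex:
    """Exhaustive inner-product search; a hypothetical stand-in for a Faiss index."""

    def __init__(self) -> None:
        self._vectors: Dict[str, Vector] = {}

    def add(self, video_id: str, feature: Vector) -> None:
        self._vectors[video_id] = l2_normalize(feature)

    def search(self, query: Vector, k: int) -> List[Tuple[str, float]]:
        """Return up to k (video_id, similarity) pairs, most similar first."""
        q = l2_normalize(query)
        scored = [
            (vid, sum(a * b for a, b in zip(q, vec)))
            for vid, vec in self._vectors.items()
        ]
        scored.sort(key=lambda item: item[1], reverse=True)
        return scored[:k]


def suspected_duplicates(
    index: FlatIndex,
    frame_features: List[Vector],
    k: int = 5,
    threshold: float = 0.95,  # illustrative value, not taken from the thesis
) -> List[str]:
    """Return ids of indexed videos similar enough to be sent to manual review."""
    query = video_feature(frame_features)
    return [vid for vid, score in index.search(query, k) if score >= threshold]


if __name__ == "__main__":
    index = FlatIndex()
    index.add("orig_a", video_feature([[1.0, 0.0, 0.0], [0.9, 0.1, 0.0]]))
    index.add("orig_b", video_feature([[0.0, 1.0, 0.0], [0.0, 0.9, 0.1]]))
    new_submission = [[0.95, 0.05, 0.0], [1.0, 0.0, 0.0]]
    print(suspected_duplicates(index, new_submission))
```

An exhaustive scan like this is only practical for small collections; at the scale reported in the abstract (about 250 million videos), a dedicated vector index such as the Faiss engine named there is what makes millisecond-level retrieval plausible.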
Keywords/Search Tags:Similar video retrieval, Video duplication check, Vector retrieval, Copyright protection