Research on Near-Duplicate Video Retrieval Based on Hash Learning

Posted on: 2019-02-04
Degree: Master
Type: Thesis
Country: China
Candidate: M Z Yang
Full Text: PDF
GTID: 2428330545953705
Subject: Software engineering
Abstract/Summary:
With the rapid development of the Internet, people's lives have become more colorful, and the ways in which we share interesting things have gradually diversified. In the early days we shared text on the Internet, later we shared pictures, and now video has become the most common way of sharing on social networks. As a result, the Internet has been flooded with a large number of near-duplicate videos. This massive volume of near-duplicate video causes many problems, such as a poor experience when users search for video, insufficient protection of copyrighted video, and difficulties in video recommendation. It also brings new challenges to near-duplicate video retrieval technology.

In recent years, hash learning techniques have gradually been applied to large-scale near-duplicate video retrieval. Hash learning refers to compressing data into binary codes by means of machine learning algorithms, which reduces storage and communication overhead. It greatly speeds up retrieval and is well suited to large-scale video content retrieval. The goal of video hash learning is to represent each video as a binary hash code such that the proximity relationships among videos in the original database are preserved as far as possible: the hash codes of similar videos should be as similar as possible, and the hash codes of dissimilar videos should be as different as possible. Effective hashing makes efficient computation possible with very limited hardware resources while maintaining accuracy. Hash learning can therefore effectively address near-duplicate video retrieval in the age of big data, and the method itself has great research value.

Near-duplicate video retrieval using hash learning can generally be divided into three steps. In the first step, key frames are extracted from the original video, and features are extracted from the key frames (multiple types of features can be extracted). In the second step, the hash learning method fuses the extracted features and expresses them as a real-valued vector. In the third step, this vector is quantized to obtain the binary hash code that serves as the final representation of the video, and the hash code is used for retrieval. Feature selection and extraction, carried out in the first two steps, are very important: good features play a decisive role in the entire hash learning process, and deficiencies in the features are reflected directly in the retrieval results. The quantization of the real-valued vector in the third step is also very important, because it involves information loss. Existing methods usually handle this step in a relatively simple way: a threshold is chosen directly, and the values on either side of it are quantized to 0 and 1, respectively. Such an approach inevitably causes excessive information loss, which affects the final retrieval results.

Our research focuses mainly on the first and second steps. Most current methods extract only low-level visual features as the input to hash learning. Compared with higher-level features, however, low-level features often lack semantic content, so their representation of the original video is often not accurate enough. To address this problem, in this thesis we first extract intermediate-level deep features and high-level semantic features from a specific convolutional neural network, and we also extract two kinds of handcrafted features. To combine these high-level features with low-level visual features, we design a hierarchical feature fusion hashing method that globally exploits the information derived from the different levels of features; it seeks a common discriminant space for the multiple feature types by learning linear transforms. The experimental results show that the proposed method is more effective than existing methods and achieves higher retrieval accuracy while using shorter hash codes.
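
The three-step pipeline described in the abstract (feature fusion into a real-valued vector, quantization into a binary code, retrieval by hash code) can be illustrated with a minimal Python sketch. It is not the thesis's method: the function names are hypothetical, the projection matrix is supplied by hand or drawn at random rather than learned, concatenation stands in for the hierarchical fusion, and the quantizer is the simple fixed-threshold baseline that the abstract describes as lossy.

```python
import random


def make_projection(n_bits, dim, seed=0):
    """Assumption: a random Gaussian projection stands in for the learned
    linear transform; the thesis learns this mapping instead."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def fuse(features, projection):
    """Step 2 (simplified): concatenate the per-type feature vectors and
    map them to a real-valued vector with a linear projection."""
    x = [value for feature in features for value in feature]
    return [sum(w * xi for w, xi in zip(row, x)) for row in projection]


def quantize(vector, threshold=0.0):
    """Step 3 (baseline): values above the threshold become 1, the rest 0."""
    return [1 if value > threshold else 0 for value in vector]


def hamming(code_a, code_b):
    """Number of bit positions in which two equal-length codes differ."""
    return sum(a != b for a, b in zip(code_a, code_b))


def rank(query_code, database):
    """Return database keys ordered by Hamming distance to the query code."""
    return sorted(database, key=lambda name: hamming(query_code, database[name]))


# Toy usage: two feature types of an illustrative video, hashed to 8 bits.
projection = make_projection(n_bits=8, dim=5)
video_features = [[0.2, 0.7, 0.1], [0.9, 0.4]]
code = quantize(fuse(video_features, projection))
```

Because codes are compared by Hamming distance, ranking needs only bit comparisons, which is the source of the storage and speed benefits the abstract mentions.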
Keywords/Search Tags: Near-duplicate video retrieval, video hashing, hierarchical feature, supervised learning