
Technology Research And System Realization On Cross-Media Data Retrieval Based On Hashing Learning

Posted on: 2018-02-05
Degree: Master
Type: Thesis
Country: China
Candidate: T K Yan
GTID: 2348330512486435
Subject: Computer technology
Abstract/Summary:
In recent years, with the rapid development of science and technology, the volume of data across many media types has grown enormously, a phenomenon commonly called "Big Data". Users' needs for multimedia retrieval have become more diverse as well. Search engines such as Baidu, Google, and Bing can quickly and easily retrieve large amounts of multimedia data (text, pictures, audio, and video), which has also benefited digital entertainment. However, this style of retrieval is relatively simple: a query can only retrieve data of its own media type, so text retrieves only text and pictures retrieve only pictures. Today, people increasingly want data of different media types to retrieve one another. Jointly analyzing and processing data of multiple media types can better meet search needs, so the traditional approach to retrieval needs to be rethought. Cross-media retrieval has therefore come into view and is now the subject of extensive research.

Approximate Nearest Neighbor (ANN) search, also known as similarity search, finds the items in a collection that are most similar to a query; these items are called its nearest neighbors. Studies have found that ANN search is well suited to multimedia retrieval, because it can quickly return data of the desired media type. Hashing-based ANN search in particular has attracted wide attention. On the one hand, hashing algorithms represent high-dimensional feature data with compact, low-dimensional binary codes, which reduces storage cost. On the other hand, hashing-based retrieval is insensitive to the dimensionality of the original features, so similarity computation is very fast, which suits retrieval over massive media data.

Most data in real-world applications carry semantic label information, so many supervised multi-modal hashing methods use this information to improve retrieval accuracy. These methods have several shortcomings. Some learn the hash functions from a similarity matrix, which loses some useful information from the original data. Some lack robustness and are easily affected by noisy samples. Others relax the binary constraints on the hash codes to sidestep the discrete optimization problem, or learn the hash functions and the hash codes in two separate steps, which severely degrades the quality of the hash codes.

To address these problems, we propose a novel supervised hashing framework for cross-modal retrieval, Supervised Robust Discrete Multimodal Hashing (SRDMH). In SRDMH, we optimize the binary codes directly instead of relaxing the binary constraints. To make full use of the label information, we require the binary codes to preserve the original label information. To make the model robust to noise, we introduce an ℓ2,p (0 < p ≤ 2) loss, which has been shown to alleviate the effect of sample noise. To make the model easy to optimize, we introduce an intermediate representation of the binary codes, which lets us decompose the difficult discrete optimization problem into two sub-problems. Extensive experiments on three benchmark data sets show that the proposed method outperforms or is comparable to several state-of-the-art hashing methods for cross-modal retrieval.

We also design and implement a cross-media retrieval system. The system uses a browser/server architecture and provides two cross-media search functions: retrieving images with text and retrieving text with images. Its structure consists mainly of a page display layer, a core business layer, and a data storage layer. The core of the system uses our proposed multi-modal hashing framework to store and retrieve media information. The system demonstrates the effectiveness and efficiency of our method and provides a reference for practical applications.
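
The two ideas the abstract relies on, ranking by Hamming distance over binary codes and a row-wise ℓ2,p loss, can be illustrated with a minimal sketch. It is not the SRDMH implementation. The function names, the 8-bit codes, and the sample residuals are all hypothetical, and the loss is assumed to take the common form of summing the p-th power of each sample's Euclidean residual norm.

```python
from math import sqrt


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two binary codes stored as ints."""
    return bin(a ^ b).count("1")


def rank_by_hamming(query: int, database: dict) -> list:
    """Return database keys sorted by Hamming distance to the query code."""
    return sorted(database, key=lambda k: hamming(query, database[k]))


def l2p_loss(residual_rows, p: float) -> float:
    """Sum over samples of (Euclidean norm of that sample's residual) ** p.

    Assumed form of an l2,p loss raised to the p-th power; p = 2 gives the
    usual squared error, smaller p down-weights large (noisy) residuals.
    """
    return sum(sqrt(sum(v * v for v in row)) ** p for row in residual_rows)


# Hypothetical 8-bit codes for three images. A text query hashed into the
# same code space can be compared with them directly.
image_codes = {"img_a": 0b10110010, "img_b": 0b10110011, "img_c": 0b01001101}
text_query = 0b10110011
print(rank_by_hamming(text_query, image_codes))

# Hypothetical residuals: the last row plays the role of a noisy sample.
residuals = [[0.1, 0.2], [0.0, 0.1], [3.0, 4.0]]
print(l2p_loss(residuals, p=2.0), l2p_loss(residuals, p=1.0))
```

Because the codes are compared with an XOR and a bit count, the cost of one comparison depends on the code length rather than on the dimensionality of the original features, which is the storage and speed advantage the abstract describes.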
Keywords/Search Tags:Approximate Nearest Neighbor Search, Learning to Hash, Multi-modal Hashing, Cross-media Retrieval, Discrete Hashing