The Information Retrieval Based On Self-taught Hashing

Posted on: 2016-12-10  Degree: Master  Type: Thesis
Country: China  Candidate: Y Li  Full Text: PDF
GTID: 2298330467497462  Subject: Computer technology
Abstract/Summary:
Society has entered the information era, and information has permeated all walks of life: industry, agriculture, education, and government departments all depend on it heavily. Successful and efficient information retrieval can therefore greatly improve performance in these fields. On the one hand, information retrieval points out directions for scientific research across the whole system of big science, and all kinds of activities are inseparable from people's information queries. On the other hand, information retrieval makes scientific research much more convenient, because it saves a great deal of time and avoids a lot of unnecessary trouble.

Similarity search, also known as nearest neighbor search, approximate search, or approximate item search, aims to find the database item most similar to a query document; that item, the one at the smallest distance from the query, is called the nearest neighbor. Some recently proposed techniques can produce high-quality codes for documents that are given in advance, but obtaining the code of a previously unseen document remains a challenging problem. Existing methods either have very high computational complexity or require very strict assumptions about the data distribution.

In this thesis, we first introduce the related technologies. We begin with similarity search and its importance. We then introduce hashing algorithms from two perspectives: hash-table lookup and extremely fast distance approximation. Finally, we describe the common hashing techniques in detail, including locality-sensitive hashing (LSH) and stacked restricted Boltzmann machines (RBMs), among others. Next, we introduce the spectral hashing algorithm and SVM classification, and combine the two in self-taught hashing. For spectral hashing we mainly cover two aspects, spectral relaxation and out-of-sample extension; for the SVM we mainly introduce its principle as a basis for further understanding.

We mainly study the problem of coding previously unseen documents and propose a novel self-taught hashing method for semantic hashing. Self-taught hashing is divided into two learning phases, which use spectral hashing and SVM classification respectively. The thesis describes both phases in detail and reports experimental results. It closes with a summary and an outlook: self-taught hashing still has many application directions waiting to be explored.
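The two learning phases can be illustrated with a toy sketch. It is a minimal illustration under stated assumptions, not the thesis's implementation: a graph-Laplacian embedding thresholded at the median stands in for the spectral hashing phase, a hand-written linear SVM trained by subgradient descent stands in for the SVM classification phase, and the data are two synthetic 2-D clusters. All function names and parameter values (`spectral_codes`, `train_linear_svm`, `sigma`, `lam`, `lr`, and so on) are hypothetical.

```python
import numpy as np

def spectral_codes(X, n_bits, sigma=1.0):
    """Phase 1 (unsupervised): binary codes from a graph-Laplacian embedding."""
    sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    W = np.exp(-sq_dists / (2.0 * sigma ** 2))      # Gaussian affinities
    np.fill_diagonal(W, 0.0)
    d_inv_sqrt = 1.0 / np.sqrt(W.sum(axis=1))
    # Symmetric normalized Laplacian: I - D^{-1/2} W D^{-1/2}
    L_sym = np.eye(len(X)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(L_sym)                 # eigenvalues in ascending order
    # Skip the trivial first eigenvector; rescale to solutions of L y = lambda D y
    Y = d_inv_sqrt[:, None] * vecs[:, 1:n_bits + 1]
    # Threshold each dimension at its median so each bit is roughly balanced
    return (Y > np.median(Y, axis=0)).astype(np.int8)

def train_linear_svm(X, bit, lam=0.01, lr=0.05, epochs=1000):
    """Phase 2 (supervised): one linear SVM per bit, hinge loss + L2 penalty."""
    Xb = np.hstack([X, np.ones((len(X), 1))])       # constant feature acts as bias
    y = 2.0 * bit - 1.0                             # labels {0, 1} -> {-1, +1}
    w = np.zeros(Xb.shape[1])
    for _ in range(epochs):
        violated = y * (Xb @ w) < 1.0               # points inside the margin
        hinge_grad = (y[violated][:, None] * Xb[violated]).sum(axis=0) / len(X)
        w -= lr * (lam * w - hinge_grad)
    return w

def predict_codes(X, weights):
    """Hash unseen points: one bit per trained SVM."""
    Xb = np.hstack([X, np.ones((len(X), 1))])
    return (Xb @ weights.T > 0).astype(np.int8)

def hamming(code, codes):
    """Hamming distance from one code to every row of `codes`."""
    return (codes != code).sum(axis=1)

# Synthetic data: two well-separated clusters (an assumption for illustration).
rng = np.random.default_rng(0)
cluster_a = rng.normal(loc=0.0, scale=0.3, size=(20, 2))
cluster_b = rng.normal(loc=3.0, scale=0.3, size=(20, 2))
X_train = np.vstack([cluster_a, cluster_b])

codes = spectral_codes(X_train, n_bits=2)
weights = np.array([train_linear_svm(X_train, codes[:, j])
                    for j in range(codes.shape[1])])

# A previously unseen query near cluster A is hashed by the SVMs alone.
query = np.array([[0.1, -0.1]])
query_code = predict_codes(query, weights)[0]
ranking = np.argsort(hamming(query_code, codes))    # training items, closest first
```

In this toy setup the first bit separates the two clusters, and the unseen query receives the same first bit as the cluster it lies in without the embedding being recomputed. Later bits split points within clusters and may be noisier, so a realistic evaluation would need real document vectors and more bits.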
Keywords/Search Tags: Similarity search, spectral hashing, self-taught hashing, SVM classifier