Information Retrieval Collection Selection Method Based On Distributed Representation And Local Sorting

Posted on: 2017-03-24
Degree: Master
Type: Thesis
Country: China
Candidate: K Qian
Full Text: PDF
GTID: 2278330482481845
Subject: Computer system architecture
Abstract/Summary:
Collection selection is an important part of a distributed information retrieval system. Using the semantic information of text to measure the relevance between a query and a collection is an effective way to improve collection selection accuracy. From that perspective, this paper proposes a collection selection method for information retrieval based on distributed representation and local ranking. First, to address the inaccurate extraction of document semantics in existing collection selection methods, the method uses a neural network language model to train distributed representation vectors for queries and documents. Second, because the original query is often too short to determine the user's intention, the method expands the query by combining Wikipedia and ListNet, which further improves the accuracy of the estimated relevance between query and document. Third, since traditional document ranking methods are not suited to finding the most relevant documents for all collections simultaneously, the method uses local ranking together with a document score threshold. ReDDE, MReDDE, CRCS and LBCS are chosen as baseline methods. In three collection partition environments, the experiments verify the effectiveness of these three elements both individually and in combination, and examine how much each element contributes. The experimental results show that the proposed method selects better collections and outperforms the selected baselines in precision.
Keywords/Search Tags: collection selection, distributed information retrieval, distributed representation, query expansion, local ranking
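
The abstract does not give the scoring formulas, so the following Python is only a minimal sketch of the general idea of local ranking with a document score threshold, not the method from the thesis. It assumes that query and document vectors are already available (for example, from a neural language model), that relevance is measured by cosine similarity, and that a collection's score is the sum of its retained document scores. The names `cosine`, `select_collections`, `threshold`, `top_k` and `n_select` are hypothetical and chosen for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def select_collections(query_vec, collections, threshold=0.5, top_k=2, n_select=1):
    """Rank collections for a query (illustrative assumption, not the thesis formula).

    collections maps a collection name to a list of document vectors.
    Documents are ranked inside each collection separately (local ranking),
    only the top_k documents with similarity >= threshold are kept, and the
    collection score is the sum of the kept similarities.
    """
    scores = {}
    for name, doc_vecs in collections.items():
        # Local ranking: sort this collection's documents on their own,
        # without building one global ranking across all collections.
        sims = sorted((cosine(query_vec, d) for d in doc_vecs), reverse=True)
        kept = [s for s in sims[:top_k] if s >= threshold]
        scores[name] = sum(kept)
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:n_select], scores


# Toy data: two-dimensional vectors stand in for learned representations.
toy_collections = {
    "sports": [[1.0, 0.0], [0.9, 0.1]],
    "cooking": [[0.0, 1.0], [0.1, 0.9]],
}
selected, scores = select_collections([1.0, 0.0], toy_collections)
```

In this toy setting the threshold discards every document in the "cooking" collection, so only "sports" receives a positive score. A real system would replace the toy vectors with trained representations and an expanded query.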