Research And Application Of Enterprise-Level Meta-Search Engine

Posted on: 2013-10-21
Degree: Master
Type: Thesis
Country: China
Candidate: Y Hu
Full Text: PDF
GTID: 2208330434970462
Subject: Computer technology
Abstract/Summary:
With the sustained, rapid growth of global informatization, the volume of information held as electronic files on people's storage devices keeps expanding, and people's expectations for access to that information have risen with it. How to find the desired information quickly and accurately among a large and varied collection of electronic documents has become a real problem. The birth and development of search engines solves this problem to a certain extent, but a search engine has its limitations: a single search engine cannot be expected to meet the needs of different scenarios and varied queries.
The scenario addressed in this thesis is the enterprise search environment of a Group. Each branch of the Group maintains its own search engine to provide full-text document search for itself, while the Group also needs search that covers all of its branches, so a Group-oriented, enterprise-level meta-search engine has to be built. This differs considerably from a web meta-search engine; the main concern here is the sorting algorithm for meta-search in this specific enterprise scenario.
I studied the sorting algorithms of classic full-text search engines and meta-search engines extensively and in depth, and analysed and summarized the characteristics and applicable scenarios of the various sorting algorithms. I then examined the Lucene document scoring mechanism in depth and proposed a normalization formula for the meta-search scenario, which removes the local weighting in the Lucene scoring formula that is inappropriate for this scenario.
Finally, combining the classic ideas behind the document scoring algorithm, the weighted algorithm, and the HITS algorithm, I proposed a hybrid weighted algorithm that iteratively reweights document scores in the meta-search environment, thereby changing each document's relevance score and yielding an optimized result ordering. On the basis of this research, I built the Branch full-text search engine system and the Group meta-search engine.
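To make the merging problem concrete, here is a minimal Python sketch of why scores from independent branch engines need normalizing before they can be sorted together. It is not the normalization formula or the hybrid weighted algorithm proposed in the thesis, which the abstract does not spell out. It swaps in plain min-max normalization and a single, non-iterative weighted sum. The names `min_max_normalize`, `merge`, the engine labels, and the weights are all hypothetical.

```python
from collections import defaultdict


def min_max_normalize(results):
    """Scale one engine's raw scores into [0, 1] so engines are comparable.

    `results` maps document id -> raw score from a single branch engine.
    (Assumption: min-max scaling stands in for the thesis's own formula.)
    """
    if not results:
        return {}
    scores = list(results.values())
    lo, hi = min(scores), max(scores)
    if hi == lo:
        # All scores equal: treat every hit as equally relevant.
        return {doc: 1.0 for doc in results}
    return {doc: (s - lo) / (hi - lo) for doc, s in results.items()}


def merge(results_by_engine, engine_weights):
    """Combine normalized scores with a per-engine weight (hypothetical).

    Returns (doc_id, merged_score) pairs, best first; ties break on doc id.
    """
    merged = defaultdict(float)
    for engine, results in results_by_engine.items():
        weight = engine_weights.get(engine, 1.0)
        for doc, score in min_max_normalize(results).items():
            merged[doc] += weight * score
    return sorted(merged.items(), key=lambda kv: (-kv[1], kv[0]))


# Two branch engines whose raw scores live on different scales.
results_by_engine = {
    "branch_a": {"doc1": 8.0, "doc2": 2.0, "doc3": 5.0},
    "branch_b": {"doc3": 0.9, "doc4": 0.3},
}
engine_weights = {"branch_a": 1.0, "branch_b": 0.8}

for doc, score in merge(results_by_engine, engine_weights):
    print(doc, round(score, 3))
```

Sorting the raw scores directly would let `branch_a` dominate simply because its scale is larger. After normalization, a document returned by both engines (`doc3` here) can outrank a document that only one engine scored highly.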
Keywords/Search Tags:Enterprise-class, Meta-search engine, Sorting algorithm, Lucene