
Research On Results Merging Algorithm In Meta Search Engine

Posted on: 2017-01-04
Degree: Master
Type: Thesis
Country: China
Candidate: Z J Li
Full Text: PDF
GTID: 2348330518970818
Subject: Computer Science and Technology
Abstract/Summary:
Search engines make it much easier for users to find information, but research shows that their resource coverage still cannot meet user demand and that their accuracy still needs improvement. A meta search engine integrates many independent search engines: it uses these member engines to perform the retrieval and then processes the returned result sets in a unified way. Because this solves some of the problems of individual search engines, meta search engines have been widely used. At present, research on the key technologies of meta search engines mainly covers retrieval request analysis and transformation, member engine scheduling, and algorithms for merging search results. This thesis focuses on result merging, which consists of two parts: duplicate web page detection and result sorting. Both are very important to a meta search engine, but existing algorithms still have deficiencies, and this thesis studies those deficiencies. The main work is as follows:
(1) The system structure and working principle of meta search engines are studied systematically, the state of research in China and abroad is analyzed, and the key technologies and research hot spots of meta search engines are introduced in detail.
(2) Commonly used duplicate web page detection algorithms are compared. Based on their advantages and disadvantages, and on the characteristics of the results returned to a meta search engine, a duplicate web page detection algorithm is proposed. It identifies duplicates using the URL, title, and abstract of each returned result, and applies a different judgment method to each of these three fields according to its characteristics, which makes the detection more accurate.
(3) The advantages and disadvantages of commonly used sorting algorithms are analyzed, with the main focus on the Borda voting method. On the basis of the Borda method, an improved algorithm is proposed that combines the positional relationship of the results with their similarity to the user's query, and that also improves the methods for computing the position score and the query similarity.
(4) A prototype meta search engine system is built, and experiments on the duplicate web page detection algorithm and the sorting algorithm are carried out on it. Analysis of the experimental results verifies the efficiency of the algorithms.
Finally, the main work, the innovations, and the experimental process are summarized, and the development direction of meta search engines and future research issues are discussed.
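For readers unfamiliar with the two steps named above, the following is a minimal sketch of how duplicate detection and Borda-style merging can fit together. It is not the algorithm proposed in the thesis: it uses the classic Borda count, a simple URL normalization, and a title-similarity check, and it leaves out the abstract comparison and the query-similarity term. The function names, the result dictionary layout, and the 0.9 title threshold are all assumptions made for illustration.

```python
from difflib import SequenceMatcher
from urllib.parse import urlsplit


def normalize_url(url):
    """Build a comparison key: lowercase host without 'www.', path without
    a trailing slash, plus the query string if present (scheme is ignored)."""
    parts = urlsplit(url.strip())
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    key = host + parts.path.rstrip("/")
    if parts.query:
        key += "?" + parts.query
    return key


def is_duplicate(a, b, title_threshold=0.9):
    """Assumed heuristic: same normalized URL, or near-identical titles."""
    if normalize_url(a["url"]) == normalize_url(b["url"]):
        return True
    ratio = SequenceMatcher(None, a["title"].lower(), b["title"].lower()).ratio()
    return ratio >= title_threshold


def borda_merge(ranked_lists):
    """Merge ranked lists with the classic Borda count.

    In a list of n results, the item at 0-based position p earns n - p points.
    Points of results judged to be duplicates are added together.
    Returns (url, score) pairs sorted by descending score.
    """
    reps = []    # first-seen representative of each distinct page
    scores = []  # accumulated Borda points, parallel to reps
    for results in ranked_lists:
        n = len(results)
        for position, result in enumerate(results):
            points = n - position
            for i, rep in enumerate(reps):
                if is_duplicate(rep, result):
                    scores[i] += points
                    break
            else:
                reps.append(result)
                scores.append(points)
    order = sorted(range(len(reps)), key=lambda i: -scores[i])
    return [(reps[i]["url"], scores[i]) for i in order]


# Hypothetical result lists from two member engines.
engine_a = [
    {"url": "http://www.example.com/a/", "title": "Alpha page"},
    {"url": "http://example.com/b", "title": "Beta page"},
]
engine_b = [
    {"url": "http://example.com/b/", "title": "Beta page"},
    {"url": "http://example.com/c", "title": "Gamma notes"},
    {"url": "https://example.com/a", "title": "Alpha page"},
]
merged = borda_merge([engine_a, engine_b])
```

In this sketch a page returned by several member engines accumulates points from each list, so agreement between engines pushes it up the merged ranking. A weighting by query similarity, as described in point (3), would be an additional term on top of these position points.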
Keywords/Search Tags:meta search engine, duplicated web pages detection, result sorting, query similarity