
Research And Implementation Of Vertical Search Engine For Journal Papers And Monographs

Posted on: 2017-04-01
Degree: Master
Type: Thesis
Country: China
Candidate: B W Zheng
GTID: 2348330488959075
Subject: Computer technology
Abstract/Summary:
Progress in information technology has driven massive growth in data. Huge volumes of information are distributed across servers around the world, and that information is growing and being updated rapidly. This abundance gives Internet users rich sources of information, but it also makes it hard for them to extract what they need: finding accurate information is like looking for a needle in a haystack. Outstanding general-purpose search engines such as Bing, Google, and Baidu exist, but because of the sheer amount of data on the Internet they do not collect every page, do not always maintain and update their indexes in time, and cannot always understand the user's query intent. As a result, query results suffer from low accuracy, poor timeliness, and a high rate of duplicates. Users have therefore placed new demands on search engine technology: a search engine must provide more timely, more detailed, more accurate, and deeper information. Vertical search engine technology emerged in response.

This thesis studies and implements a vertical search engine targeting research results in a specific area of basic theory. It examines the basic principles of vertical search engines, their core technologies, and the web crawling process, and then designs and implements an information collection module, a crawler module, an information indexing module, a query result retrieval module, and a user interaction module. The web page analysis module implements URL deduplication, compares two web page denoising schemes and adopts the better-performing one, and carries out web page deduplication. Finally, the remaining problems and directions for future research are discussed.
Keywords/Search Tags:Theoretical Research Results, Vertical Search Engine, Web Crawler, Lucene, Web Page Denoising
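
The duplicate-URL elimination that the abstract assigns to the web page analysis module is not described in detail here, so the following is only a minimal sketch of one common approach: normalize each URL, then keep a set of URLs already seen. The names `normalize_url` and `UrlFrontier`, and the particular normalization rules, are assumptions made for illustration and are not taken from the thesis.

```python
from collections import deque
from urllib.parse import urlsplit, urlunsplit


def normalize_url(url: str) -> str:
    """Canonicalize a URL so trivially different spellings compare equal.

    Hypothetical rules: lower-case scheme and host, drop default ports,
    use "/" for an empty path, and discard the fragment.
    """
    parts = urlsplit(url.strip())
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    port = parts.port
    is_default_port = (scheme == "http" and port == 80) or (
        scheme == "https" and port == 443
    )
    if port and not is_default_port:
        host = f"{host}:{port}"
    path = parts.path or "/"
    # The fragment is never sent to the server, so it is dropped.
    return urlunsplit((scheme, host, path, parts.query, ""))


class UrlFrontier:
    """Crawl queue that silently rejects URLs it has already accepted."""

    def __init__(self) -> None:
        self._seen = set()
        self._queue = deque()

    def add(self, url: str) -> bool:
        """Queue the URL; return False if an equivalent URL was seen before."""
        key = normalize_url(url)
        if key in self._seen:
            return False
        self._seen.add(key)
        self._queue.append(key)
        return True

    def pop(self):
        """Return the next URL to crawl, or None when the queue is empty."""
        return self._queue.popleft() if self._queue else None


frontier = UrlFrontier()
first = frontier.add("http://example.com/a")
second = frontier.add("HTTP://EXAMPLE.com:80/a#top")  # same page, different spelling
```

An in-memory set is enough for a small, domain-restricted crawl; a larger crawl would more likely store hashes of the normalized URLs or use a Bloom filter, trading a small false-positive rate for lower memory use.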