
Individual Search Engine Application Based On Industry

Posted on: 2009-07-05
Degree: Master
Type: Thesis
Country: China
Candidate: T Wang
GTID: 2178360242489077
Subject: Computer application technology
Abstract/Summary:
Most current search engines are general-purpose: they cover every discipline and industry, which makes it difficult for them to reflect information on a particular topic completely, quickly, and accurately. This thesis therefore implements an industry-specific search engine.

The work is divided into five parts. (1) WebCrawler: continuously crawls pages from the Internet, analyzes the links each page contains, follows those links, and downloads the pages to the local machine. (2) Simplier: analyzes the pages fetched by WebCrawler, removes the control commands and formatting, and keeps only the content. (3) Analyzer: performs Chinese word segmentation on the remaining content to form keywords, then builds an index from the keywords using a combination of an inverted table and a hash table. (4) Search: segments the user's query, finds the matching pages, scores the results comprehensively, and sorts them. (5) Finally, the latest industry information is sent to the relevant users by Email. In addition, the system supports dynamic expansion of the vocabulary, to which a single word or a batch of new words can be added.

The thesis combines industry knowledge with search engine technology. The industry focus is embodied in two ways: first, the initial page chosen by WebCrawler is a central website for the industry, which links to many industry-related sites; second, the inverted index is built from an industry vocabulary. Taking the pharmaceutical industry as an example, one needs an initial page and a vocabulary for the pharmaceutical industry. The innovation of the thesis lies in personalization: users receive the industry information they are interested in promptly and accurately by Email. An industry-specific search engine makes the search results more professional and more personalized.
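The Analyzer and Search parts described above can be illustrated with a minimal sketch. It is not the thesis implementation: the abstract does not name a segmentation algorithm or a scoring formula, so forward maximum matching against the industry vocabulary and a plain term-frequency score are assumptions made here for illustration, and the names `segment`, `build_index`, and `search` are hypothetical. A Python `dict` stands in for the hash table that maps each keyword to its inverted list.

```python
from collections import defaultdict


def segment(text, vocabulary):
    """Split text into vocabulary words by forward maximum matching.

    Assumption: the thesis does not say which segmentation method it uses.
    Characters that start no vocabulary word are skipped.
    """
    max_len = max(map(len, vocabulary), default=1)
    words = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then shorter ones.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if candidate in vocabulary:
                words.append(candidate)
                i += size
                break
        else:
            i += 1  # no vocabulary word starts at this position
    return words


def build_index(pages, vocabulary):
    """Build an inverted index: keyword -> {page_id: occurrence count}."""
    index = defaultdict(dict)
    for page_id, text in pages.items():
        for word in segment(text, vocabulary):
            index[word][page_id] = index[word].get(page_id, 0) + 1
    return index


def search(query, index, vocabulary):
    """Segment the query, score pages by summed term frequency, and sort.

    Assumption: summed term frequency is only a stand-in for the
    "comprehensive" scoring mentioned in the abstract.
    """
    scores = defaultdict(int)
    for word in segment(query, vocabulary):
        for page_id, count in index.get(word, {}).items():
            scores[page_id] += count
    # Highest score first; ties broken by page id for a stable order.
    return sorted(scores, key=lambda page_id: (-scores[page_id], page_id))


# Hypothetical pharmaceutical vocabulary and already-simplified page content.
vocabulary = {"抗生素", "药品", "疫苗"}  # antibiotic, drug, vaccine
pages = {"p1": "抗生素药品抗生素", "p2": "疫苗药品"}

index = build_index(pages, vocabulary)
print(search("抗生素药品", index, vocabulary))  # ['p1', 'p2']
print(search("疫苗", index, vocabulary))        # ['p2']
```

Because the vocabulary is an ordinary set, adding a single word or a batch of words (`vocabulary.add(...)`, `vocabulary.update(...)`) mirrors the dynamic vocabulary expansion mentioned above; in this sketch the index would have to be rebuilt afterwards for the new words to become searchable.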
Keywords/Search Tags: Search Engine, Pharmaceutical Industry, WebCrawler, Inverted Index