Research And Implementation Of Subject-oriented Mobile Search Engine Based On Lucene

Posted on: 2013-01-28  Degree: Master  Type: Thesis
Country: China  Candidate: R Cheng  Full Text: PDF
GTID: 2248330395474637  Subject: Software engineering
Abstract/Summary:
With the rapid development of the Internet and the World Wide Web, a vast amount of information is now available online, and people increasingly depend on the Internet for research. General-purpose search engines emerged to help people find useful information and have developed rapidly; some, such as Baidu and Google, have become popular. For specialized topics, however, general search engines are insufficient and cannot meet the needs of specific users. A topic-specific search engine addresses this problem better.

This thesis studies the key technologies of topic-specific search engines in depth, and designs and implements a Lucene-based topic-specific search engine oriented toward entertainment information. The content of the thesis is as follows.

First, the thesis describes the relevant concepts, background, classification, research status, and evaluation metrics.

Second, it analyzes the key search engine technologies in depth, including the search strategies and evaluation metrics for web crawlers. The system is built on Lucene, which is Java-based. The thesis also studies word segmentation in depth, since it is very important to a topic-specific search engine.

Third, it designs and implements a vertical search engine for entertainment information. The system has three modules: an information collection module, an index module, and a search module. In the information collection module, the Shark-Search algorithm is adopted to handle URLs, and HtmlParser is adopted to extract information from web pages. In the index module, the NMSEG word segmentation algorithm is improved and extended to incorporate regional nouns, merchant names, and similar terms. In the search module, search results are sorted by weighting.

Finally, the thesis summarizes the work done and points out the problems remaining in the system and the improvements planned for the future.
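To make the URL-handling step of the information collection module more concrete, the following is a minimal sketch of Shark-Search-style link scoring in Python. It follows the commonly cited formulation of the algorithm (an inherited score from the parent page, combined with anchor-text and anchor-context relevance) and is not taken from the thesis. The function names, the `Frontier` class, the parameter values `delta`, `beta`, and `gamma`, and the bag-of-words cosine similarity are all assumptions chosen for illustration; the thesis itself builds on Java and Lucene, and Chinese text would first need word segmentation rather than the whitespace-style tokenization used here.

```python
import heapq
import math
import re
from collections import Counter


def similarity(query, text):
    """Cosine similarity between term-frequency vectors (assumed relevance measure)."""
    q = Counter(re.findall(r"\w+", query.lower()))
    t = Counter(re.findall(r"\w+", text.lower()))
    if not q or not t:
        return 0.0
    dot = sum(q[w] * t[w] for w in q)
    norm_q = math.sqrt(sum(v * v for v in q.values()))
    norm_t = math.sqrt(sum(v * v for v in t.values()))
    return dot / (norm_q * norm_t)


def inherited_score(query, parent_text, parent_inherited, delta=0.5):
    """A child inherits decayed relevance from its parent page.

    If the parent is relevant, its own relevance is passed down;
    otherwise the parent's inherited score is passed down, decayed again.
    """
    rel = similarity(query, parent_text)
    return delta * rel if rel > 0 else delta * parent_inherited


def potential_score(query, parent_text, parent_inherited, anchor, context,
                    delta=0.5, beta=0.8, gamma=0.5):
    """Combine inherited and neighborhood scores for one outgoing link.

    delta, beta, gamma are illustrative values, not the thesis's settings.
    """
    inherited = inherited_score(query, parent_text, parent_inherited, delta)
    anchor_score = similarity(query, anchor)
    # A relevant anchor makes the surrounding context count as fully relevant.
    context_score = 1.0 if anchor_score > 0 else similarity(query, context)
    neighborhood = beta * anchor_score + (1 - beta) * context_score
    return gamma * inherited + (1 - gamma) * neighborhood


class Frontier:
    """Hypothetical crawl frontier: pops the highest-scoring unseen URL first."""

    def __init__(self):
        self._heap = []
        self._seen = set()

    def push(self, url, score):
        if url not in self._seen:
            self._seen.add(url)
            heapq.heappush(self._heap, (-score, url))  # max-heap via negation

    def pop(self):
        neg_score, url = heapq.heappop(self._heap)
        return url, -neg_score
```

In a crawler loop, each link extracted from a fetched page would be scored with `potential_score` and pushed onto the `Frontier`, so that links most likely to lead to on-topic pages are fetched first.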
Keywords/Search Tags: Topic-specific search engine, Topical crawler, Lucene, Information Retrieval, Chinese Word Segmentation