
Research And Application Of Vertical Search Engine

Posted on: 2009-03-10
Degree: Master
Type: Thesis
Country: China
Candidate: T Liu
GTID: 2178360245452322
Subject: Computer application technology
Abstract/Summary:
With the rapid development of the Internet, the volume of online information is growing dramatically. General-purpose search engines therefore face increasingly difficult challenges in collecting and storing information. In addition, the large number of off-target results returned by general search sites cannot satisfy business users, who need more specialized and faster search. This has created an urgent demand for accurate, domain-specific information retrieval, and vertical search engines, which target a particular professional field, have emerged in response. To meet this demand, Cheng Cheng Guang Yuan Scientific Technology Cyber Co., Ltd. is preparing to build a vertical search engine for business users: the "Business Search" website.

Based on this project, this thesis researches and designs the search engine for "Business Search". It analyzes current models and algorithms for Web information retrieval and studies several key problems of the "Business Search" vertical search engine, focusing on three core modules: the information crawling model, the information classification model, and the information indexing model.

The information crawling model is improved to address characteristics of the contemporary Web, such as the abundance of irrelevant content and the tendency of useful links to be highly concentrated and grouped together. In particular, it improves on the original Shark-Search algorithm. It avoids irrelevant content such as advertisements, and it raises the priority of links that sit in a relevant block of a web page even when their anchor text is unrelated to the topic, so that relevant links are crawled first and more relevant links are found.

The information classification model establishes a simple, stepwise support vector machine (SVM) approach to web text classification.
The classifier builds on the advantages of support vector machines in text classification and on their binary (two-class) classification theory, applying binary classifiers in succession to achieve multiclass classification. The advantages of this approach include a simple model, accurate classification, and ease of implementation.

The information indexing model proposes an inverted index database based on Chinese characters. The index database is built from the results of the forward index and uses an improved linked storage structure, which reduces the server's cost of updating the database.

Finally, based on the above analysis, this thesis presents an overall framework design of the vertical search engine for the "Business Search" website.
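
The indexing idea in the abstract (a character-level inverted index built from a forward index) can be illustrated with a minimal sketch. This is not the thesis's implementation: the function names `build_forward_index`, `invert`, and `search_phrase` are hypothetical, the sample documents are invented for illustration, and plain Python dictionaries and lists stand in for the improved linked storage structure, whose details the abstract does not give.

```python
from collections import defaultdict


def build_forward_index(docs):
    """Forward index: doc id -> list of characters (whitespace dropped).

    Treating every Chinese character as a token is an assumption made
    for this sketch; no word segmentation is performed.
    """
    return {
        doc_id: [ch for ch in text if not ch.isspace()]
        for doc_id, text in docs.items()
    }


def invert(forward):
    """Inverted index: character -> {doc id -> list of positions}."""
    inverted = defaultdict(dict)
    for doc_id, chars in forward.items():
        for pos, ch in enumerate(chars):
            inverted[ch].setdefault(doc_id, []).append(pos)
    return inverted


def search_phrase(inverted, phrase):
    """Return sorted ids of documents containing the characters of
    `phrase` at consecutive positions."""
    chars = [ch for ch in phrase if not ch.isspace()]
    if not chars:
        return []
    hits = []
    for doc_id, starts in inverted.get(chars[0], {}).items():
        for start in starts:
            # Every following character must appear at the next position.
            if all(
                start + k in inverted.get(ch, {}).get(doc_id, ())
                for k, ch in enumerate(chars)
            ):
                hits.append(doc_id)
                break
    return sorted(hits)


if __name__ == "__main__":
    # Invented sample documents, used only to exercise the sketch.
    docs = {1: "商业搜索引擎", 2: "垂直搜索", 3: "商业信息"}
    index = invert(build_forward_index(docs))
    print(search_phrase(index, "搜索"))
    print(search_phrase(index, "商业"))
```

Building the inverted index from the forward index means that a changed document only requires its own forward entry to be re-inverted, which is one plausible reading of how the update cost is kept low.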
Keywords/Search Tags: Vertical Search, Shark-Search algorithm, Support Vector Machine, Inverted Index