Research and Implementation of a Theme Crawler for the Automotive Industry

Posted on: 2012-04-24
Degree: Master
Type: Thesis
Country: China
Candidate: Z F Peng
Full Text: PDF
GTID: 2218330362956219
Subject: Circuits and Systems
Abstract/Summary:
With the development of Internet technology and the rapid growth of online information, traditional search engines can no longer meet the growing demand for personalized service, so a variety of topic-based search engines have emerged. The theme crawler is the core component of a topic-based search engine. Because it determines the accuracy and timeliness of the returned query results, research on the theme crawler is of considerable significance. This thesis designs and implements a theme crawler for the automobile domain against that background.

The thesis first describes the current state of theme crawler development. It then compares and analyzes the mainstream implementation schemes for the key modules: models for representing the topic, web information extraction, and web search strategies. Building on this work, we propose a theme crawler scheme suited to the automobile domain and implement all of its main modules. Finally, we design performance tests for the theme crawler and summarize the results.

Specifically, our study of the automobile theme crawler covers the following aspects:

1. Based on an analysis of how mainstream theme crawlers are implemented, we propose a scheme for an automobile theme crawler and design its framework.
2. After comparing different models for representing the topic, we choose the vector space model to represent the automobile topic keywords.
3. After comparing different web page analysis and extraction solutions, we use the TagWindow tag-based technique to extract the text and links that are relevant to the topic.
4. After comparing different web search strategies, we adopt the web-based genetic algorithm search strategy to guide the theme crawler toward more topic-related resources.
5. We test the performance of each module and analyze the experimental data to demonstrate the advantages of our theme crawler in retrieving information on the automobile industry.
Keywords/Search Tags:Theme Crawler, Vector Space Model, Web Text Analysis, Genetic Algorithm
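
As an illustration of the vector space model named in the abstract, the following minimal Python sketch scores a page's text against a weighted keyword vector using cosine similarity. It is not the thesis's implementation: the keyword list, the weights, the tokenizer, and the function names term_frequencies and cosine_relevance are all assumptions made for the example.

```python
import math
from collections import Counter

# Hypothetical topic keywords and weights; the thesis does not list its vocabulary.
TOPIC_WEIGHTS = {"car": 1.0, "engine": 0.8, "sedan": 0.6, "dealer": 0.4}


def term_frequencies(text):
    """Split on whitespace, strip surrounding punctuation, lowercase, and count terms."""
    words = (w.strip(".,;:!?()\"'").lower() for w in text.split())
    return Counter(w for w in words if w)


def cosine_relevance(text, topic=TOPIC_WEIGHTS):
    """Cosine similarity between the page's term-frequency vector and the topic vector."""
    tf = term_frequencies(text)
    # Only terms present in both vectors contribute to the dot product.
    dot = sum(weight * tf.get(term, 0) for term, weight in topic.items())
    page_norm = math.sqrt(sum(count * count for count in tf.values()))
    topic_norm = math.sqrt(sum(weight * weight for weight in topic.values()))
    if page_norm == 0 or topic_norm == 0:
        return 0.0  # empty page or empty topic: treat as irrelevant
    return dot / (page_norm * topic_norm)


if __name__ == "__main__":
    print(cosine_relevance("The new sedan has a quieter engine."))
    print(cosine_relevance("Rain is expected over the weekend."))
```

A crawler built this way would typically compare the score against a threshold to decide whether to keep a page and follow its links; the threshold value would have to be tuned experimentally.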