Research on a Theme Crawler Based on Interest Push

Posted on: 2013-02-09
Degree: Master
Type: Thesis
Country: China
Candidate: L Z Yang
Full Text: PDF
GTID: 2218330374465170
Subject: Signal and Information Processing
Abstract/Summary:
The Internet is developing rapidly, and the amount of information on it has grown enormously. The limitations of general web crawlers have become apparent as a result: their recall and precision cannot satisfy users' needs, and in some specialized areas the search results are especially unsatisfactory. To solve this problem, researchers developed the theme web crawler on the basis of the general crawler, improving on the general crawler's precision. This paper analyzes Google's PageRank algorithm in detail and improves it using both link structure and the topical relevance of webpage content, proposing an improved theme crawler algorithm that combines webpage links with text content. The improved algorithm ranks search results by subject relevance, which improves on the precision of a general web crawler. A problem remains, however: with a theme crawler, the user still has to click through many relevant webpages each time to find content of interest. To address this, the paper presents a theme crawler based on interest push, which detects the user's interests from the user's click behavior and combines them with topic relevance. As a result, users of the search engine can not only find relevant webpages in the huge volume of Internet information but also, through the analysis of their personal interests, have webpages of personal interest pushed into the search results list. Using Eclipse as the development environment, we developed a digital product search engine system and applied the improved algorithm in it for verification. Testing showed that the scheme is practical and feasible: it improves the efficiency and accuracy of search engine queries and, to a large extent, provides users with a more convenient search service.
Keywords/Search Tags: digital products, topical relevance prediction, interest push, theme crawler, topic drift
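
The abstract gives no formulas, so the following Python sketch is only one possible reading of the two ideas it describes: a PageRank variant in which link weight depends on the topical relevance of the linked page, and a final ranking that blends that score with an interest score derived from the user's clicks. Everything specific here is an assumption made for illustration and not the thesis's actual method: the function names, the word-overlap relevance measure, the term-frequency interest profile, and the `damping` and `alpha` parameters. The sketch also assumes every link target appears as a key in `links` and that `links` and `pages_text` share the same keys.

```python
from collections import Counter


def topic_relevance(text, topic_terms):
    """Hypothetical relevance measure: share of words that are topic terms."""
    words = text.lower().split()
    if not words:
        return 0.0
    return sum(1 for w in words if w in topic_terms) / len(words)


def topic_weighted_pagerank(links, relevance, damping=0.85, iterations=50):
    """PageRank where a page passes rank to its targets in proportion
    to each target's topical relevance (an assumed weighting scheme)."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for p in pages:
            targets = links[p]
            total = sum(relevance[t] for t in targets)
            if not targets or total == 0:
                # No outgoing links, or none on topic: spread rank evenly.
                for q in pages:
                    new_rank[q] += damping * rank[p] / n
            else:
                for t in targets:
                    new_rank[t] += damping * rank[p] * relevance[t] / total
        rank = new_rank
    return rank


def interest_profile(clicked_texts):
    """Term-frequency profile built from the pages the user clicked."""
    counts = Counter()
    for text in clicked_texts:
        counts.update(text.lower().split())
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()} if total else {}


def interest_score(text, profile):
    """Average profile weight of the words on a page."""
    words = text.lower().split()
    if not words:
        return 0.0
    return sum(profile.get(w, 0.0) for w in words) / len(words)


def rank_results(pages_text, links, topic_terms, clicked_texts, alpha=0.5):
    """Order pages by a blend of topic-weighted rank and user interest."""
    relevance = {p: topic_relevance(t, topic_terms) for p, t in pages_text.items()}
    pr = topic_weighted_pagerank(links, relevance)
    profile = interest_profile(clicked_texts)
    scores = {
        p: alpha * pr[p] + (1.0 - alpha) * interest_score(pages_text[p], profile)
        for p in pages_text
    }
    return sorted(scores, key=scores.get, reverse=True)


# Toy data for a digital product search (invented for this example).
pages_text = {
    "a": "digital camera review lens",
    "b": "garden flowers and soil",
    "c": "camera lens price digital",
    "d": "phone battery digital",
}
links = {"a": ["b", "c"], "b": ["a"], "c": ["a", "d"], "d": []}
topic_terms = {"digital", "camera", "lens"}

ordering = rank_results(pages_text, links, topic_terms, ["phone battery"])
```

In this sketch `alpha` controls the trade-off: at `alpha=1.0` the ordering depends only on the topic-weighted link score, while at `alpha=0.0` it depends only on the click-derived interest profile. A real system would need a more robust relevance model and a way to normalize the two scores before mixing them.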