
Research Of Web Mining And Its Applications In Search Engine

Posted on: 2008-03-26
Degree: Master
Type: Thesis
Country: China
Candidate: C W Yang
Full Text: PDF
GTID: 2178360218463584
Subject: Computer application technology
Abstract/Summary:
The fast-growing Internet is the largest information repository in the world and plays an important role in disseminating information. Because Web data is large in scale, dynamic, heterogeneous, and semi-structured, finding information on the Internet is difficult. Today people rely on search engines to retrieve the information and data they want from this mass of data, but the retrieval quality of search engines often fails to satisfy users. Web Mining, a recent research direction in knowledge mining, is a form of high-level information processing that is closely related to search engines. Applying Web Mining can advance search engines and enhance their information processing capability.

This thesis focuses on analyzing and discussing Web Mining and its applications in Web search engines. It studies Web Structure Mining algorithms in particular and analyzes the disadvantages of the traditional PageRank algorithm. It then improves PageRank by using Web page similarity, a notion drawn from Web Content Mining. Experimental results show that the improved algorithm is effective and performs as expected. Based on the same idea, the thesis also improves the Topic-Sensitive PageRank algorithm and discusses how the improved algorithm can be applied in a search engine. Finally, it calculates the relative precision ratio of a search engine under traditional PageRank, Topic-Sensitive PageRank, and the improved algorithm. The experimental results show that the improved algorithm enhances the information processing capability of a search engine more than the traditional algorithm does.
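The abstract does not give the improved formula, so the following is only an illustrative sketch of one way page similarity could be folded into PageRank. It is not the thesis's algorithm. It assumes that each page is represented by a term-frequency dictionary, that similarity is cosine similarity, and that a page passes more of its rank to the linked pages most similar to it. The names `cosine_similarity`, `similarity_pagerank`, and the `smoothing` parameter are hypothetical.

```python
from math import sqrt


def cosine_similarity(a, b):
    """Cosine similarity between two sparse term-frequency dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = sqrt(sum(w * w for w in a.values()))
    norm_b = sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def similarity_pagerank(links, vectors, damping=0.85, iterations=50, smoothing=0.1):
    """PageRank in which each out-link is weighted by content similarity.

    links:   dict mapping every page to the list of pages it links to
             (assumption: every link target is also a key of `links`).
    vectors: dict mapping every page to its term-frequency dict.
    smoothing: hypothetical constant added to each similarity so that a
             dissimilar link still passes on a little rank.
    """
    pages = list(links)
    n = len(pages)

    # Normalise the similarity scores of each page's out-links to sum to 1.
    weights = {}
    for page in pages:
        raw = {
            target: cosine_similarity(vectors[page], vectors[target]) + smoothing
            for target in links[page]
        }
        total = sum(raw.values())
        weights[page] = {target: w / total for target, w in raw.items()}

    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Rank held by pages without out-links is spread evenly over all pages.
        dangling = sum(rank[page] for page in pages if not links[page])
        new_rank = {page: (1 - damping) / n + damping * dangling / n for page in pages}
        for page in pages:
            for target, weight in weights[page].items():
                new_rank[target] += damping * rank[page] * weight
        rank = new_rank
    return rank


if __name__ == "__main__":
    # Toy graph: "a" links to "b" (same topic) and "c" (different topic).
    links = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
    vectors = {"a": {"web": 1.0}, "b": {"web": 1.0}, "c": {"music": 1.0}}
    for page, score in sorted(similarity_pagerank(links, vectors).items()):
        print(page, round(score, 4))
```

When all pages have identical vectors, every out-link receives the same weight and the sketch reduces to ordinary PageRank, which gives a convenient baseline for comparison.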
Keywords/Search Tags: Search Engine, Web Mining, PageRank, Topic-Sensitive PageRank, Precision Ratio