
The Research On Web Content Mining Technology

Posted on: 2004-01-22
Degree: Master
Type: Thesis
Country: China
Candidate: Y Liu
GTID: 2168360095957223
Subject: Computer application technology
Abstract/Summary:
Web search engines have become increasingly ineffective as the number of documents on the Web has proliferated. Users are often forced to sift through the long ordered list of document "snippets" returned by the engines. This thesis applies Web content mining to search engines. Search engine results clustering relies on the information returned by the search engine. The PAT-tree is a data structure widely used in Chinese information processing and word segmentation. This thesis applies the PAT-tree structure to Chinese information retrieval and proposes a new clustering algorithm for Chinese search engine results based on a modified PAT-tree. Experimental results demonstrate that the approach is feasible and meets the goals set for it.
Keywords/Search Tags:Web content mining, Clustering, Search engine, PAT-tree
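
The abstract does not describe the modified PAT-tree or the clustering algorithm in detail, so the following is only a rough illustration of the general idea of grouping search result snippets by the character strings they share. It is not the thesis's method: a plain dictionary of character n-grams stands in for the PAT-tree, and the function name `cluster_snippets`, its parameters, and the sample snippets are all hypothetical.

```python
from collections import defaultdict


def cluster_snippets(snippets, min_len=2, max_len=6, min_docs=2):
    """Group snippets by shared character n-grams (hypothetical sketch).

    A PAT-tree would index the same substrings compactly; a dictionary
    is used here purely to keep the illustration short.
    """
    # Record which snippets contain each character n-gram.
    docs_by_phrase = defaultdict(set)
    for doc_id, text in enumerate(snippets):
        for start in range(len(text)):
            longest = min(max_len, len(text) - start)
            for length in range(min_len, longest + 1):
                docs_by_phrase[text[start:start + length]].add(doc_id)

    # Keep only phrases shared by at least min_docs snippets.
    frequent = {p: d for p, d in docs_by_phrase.items() if len(d) >= min_docs}

    # Drop a phrase when a longer frequent phrase contains it and
    # covers exactly the same snippets, so each cluster keeps one label.
    clusters = {}
    for phrase, docs in frequent.items():
        subsumed = any(
            phrase != other and phrase in other and docs == other_docs
            for other, other_docs in frequent.items()
        )
        if not subsumed:
            clusters[phrase] = sorted(docs)
    return clusters


# Made-up snippets for illustration only.
example = ["搜索引擎结果聚类", "中文搜索引擎", "数据挖掘技术", "网页数据挖掘"]
print(cluster_snippets(example))
```

Working on character n-grams avoids a separate word segmentation step, which is one reason substring indexes are attractive for Chinese text. In this sketch the first two sample snippets end up grouped under the shared phrase 搜索引擎 and the last two under 数据挖掘.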