
Analysis of Search Results Clustering Based on Tag Word Extraction

Posted on: 2013-05-14
Degree: Master
Type: Thesis
Country: China
Candidate: W J Liu
Full Text: PDF
GTID: 2248330371466708
Subject: Computer Science and Technology
Abstract/Summary:
We live in an era of "information explosion", which has given rise to a wide variety of search engines. However, because online information is semi-structured or unstructured, search results still contain pages unrelated to the query, despite the many methods used to improve their accuracy, and relevance ranking algorithms still do not work well. To help users find the pages that interest them, a search engine can cluster the results it returns, so that users can browse by subject category and spend less effort browsing.

This thesis summarizes the key technologies of Chinese text clustering based on a study of existing algorithms, including text preprocessing, text representation models, feature extraction, feature dimensionality reduction, text similarity calculation, and existing clustering algorithms. It then analyzes support vector regression (SVR) and its technical implementation.

This thesis presents a text clustering method based on tag word extraction and implements a text clustering system. First, the pages returned in the search results are preprocessed, which includes denoising, word segmentation, and stop-word removal. Next, three-element word groups (triples) are extracted from the words to form the tag word model, and the resulting word strings are merged in a post-processing step. Finally, the corpus is clustered according to the tag words using hierarchical clustering.
Keywords/Search Tags: clustering, searching results, tag word extraction, SVR
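
The pipeline described in the abstract (preprocessing, tag word extraction, hierarchical clustering) can be illustrated with a minimal sketch. This is not the thesis's implementation. It assumes the triples are word trigrams, uses whitespace tokenization in place of Chinese word segmentation, measures similarity with Jaccard overlap of tag word sets, and merges clusters by average linkage. The stop-word list, the function names, and the `threshold` parameter are all hypothetical, and the SVR step is omitted because the abstract does not say how it is applied.

```python
from itertools import combinations

# Hypothetical stop-word list; a real system would use a full Chinese list.
STOP_WORDS = {"the", "a", "of", "and", "to", "in", "for"}


def preprocess(text):
    """Lowercase, split on whitespace, strip punctuation, drop stop words."""
    tokens = [t.strip(".,;:!?\"'()").lower() for t in text.split()]
    return [t for t in tokens if t and t not in STOP_WORDS]


def tag_words(tokens, n=3):
    """Return the set of word n-grams (trigrams by default) as tag words."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets; 0.0 when both are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def cluster(docs, threshold=0.1):
    """Agglomerative clustering with average linkage.

    Repeatedly merges the two most similar clusters and stops once the
    best similarity falls below `threshold` (an assumed cut-off).
    Returns a list of clusters, each a list of document indices.
    """
    tags = [tag_words(preprocess(d)) for d in docs]
    clusters = [[i] for i in range(len(docs))]

    def linkage(c1, c2):
        sims = [jaccard(tags[i], tags[j]) for i in c1 for j in c2]
        return sum(sims) / len(sims)

    while len(clusters) > 1:
        best_sim, x, y = max(
            (linkage(clusters[x], clusters[y]), x, y)
            for x, y in combinations(range(len(clusters)), 2)
        )
        if best_sim < threshold:
            break
        clusters[x] = clusters[x] + clusters[y]  # x < y, so deleting y is safe
        del clusters[y]
    return clusters
```

Calling `cluster(results)` on a list of result snippets returns groups of indices into that list. The threshold controls how many categories the user sees: a higher value yields more, smaller clusters.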