The Research Of Text Clustering In Personalized Information Retrieval System

Posted on:2011-04-16

Degree:Master

Type:Thesis

Country:China

Candidate:J Zhang

GTID:2178360305989383

Subject:Computer application technology

Abstract/Summary:

With the development and spread of the Internet, retrieval has become a part of daily life. The Internet connects the whole world, but how can we find what we need? The answer is retrieval. Among the many kinds of retrieval systems, literature retrieval is the most useful for researchers. However, most current retrieval systems rely only on keyword matching and cannot capture the interests of their users. A system that could capture those interests would be more convenient, because it could rank the literature a user is interested in at the top of the results. Our team has started to design a system that infers users' interests from their behavior and gathers users with similar interests into user groups, so that they can exchange and share resources. This thesis discusses the basic part of the retrieval system our team is designing; my work covers text processing and clustering. I implemented the process that converts text into vectors, which manages the stop word list and generates the vectors. I also improved affinity propagation (AP) clustering.

AP clustering has one advantage: it does not require the number of clusters to be specified in advance, so it can be used when that number is unknown. Sometimes, however, the number of clusters is known; how can this knowledge be used to improve the quality of AP clustering results? This thesis proposes an improved AP method for such cases. Compared with AP, the improved AP performs better on data sets whose number of clusters is known. Experimental results show that the improved AP is effective and that the quality of its results is better than or equal to that of AP clustering.
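The text-to-vector step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the thesis's actual stop word list, weighting scheme, or similarity measure, so everything here is an assumption: the `STOP_WORDS` set, the raw term-frequency weighting, the cosine similarity, and all function names are hypothetical and chosen only for illustration. Pairwise similarities of this kind are the sort of input a clustering algorithm such as AP works on.

```python
import math
import re
from collections import Counter

# Hypothetical stop word list; the list used in the thesis is not given.
STOP_WORDS = {"the", "a", "of", "and", "in", "to", "is", "for"}


def tokenize(text, stop_words=STOP_WORDS):
    """Lowercase the text, keep alphabetic runs, and drop stop words."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in stop_words]


def build_vocabulary(token_lists):
    """Map each distinct term to a column index (sorted for determinism)."""
    terms = sorted({t for tokens in token_lists for t in tokens})
    return {term: i for i, term in enumerate(terms)}


def to_vector(tokens, vocab):
    """Raw term-frequency vector over the vocabulary (an assumed weighting)."""
    counts = Counter(tokens)
    return [counts.get(term, 0) for term in vocab]


def cosine(u, v):
    """Cosine similarity between two vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


docs = [
    "Clustering of text documents",
    "Text clustering for retrieval",
    "Image retrieval in the wild",
]
token_lists = [tokenize(d) for d in docs]
vocab = build_vocabulary(token_lists)
vectors = [to_vector(tokens, vocab) for tokens in token_lists]

# Pairwise similarity matrix that a clustering step could take as input.
similarity = [[cosine(u, v) for v in vectors] for u in vectors]
```

In this toy example the first two documents share two of their three terms and so score higher against each other than either does against the third. A real system would likely use a larger stop word list and a weighting such as TF-IDF, but the abstract does not say which.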

Keywords/Search Tags:

Information Retrieval, Feature Extraction, Text Clustering, Improved AP Clustering

Related items

1	Research On Key Problems In Text Mining
2	Research On Patent Text Clustering Based On Improved K-means Algorithm
3	Significant Study Of Text Clustering Model Based On Machine Learning
4	A Research Of Image Retrieval Based On Improved Clustering Algorithm
5	Precise Clustering Algorithm For Chinese Text Based On K-means
6	Research On Web Text Clustering And Retrieval Technology
7	The Method Of Fine-Grained Topic Information Extraction And Text Clustering Based On Chinese Phrase
8	Research On Clustering Algorithm Of K-medoids And Its Application In Text Clustering
9	The Study Of Feature Extraction And Clustering On Chinese Websites Product Reviews Based On The Improved Pruning Algorithm
10	Research On Text Structural Information Extraction And Clustering Based On XML