Study Of Text Clustering Algorithm Based On Semantics

Posted on:2013-10-29

Degree:Master

Type:Thesis

Country:China

Candidate:Z X Guo

Full Text:PDF

GTID:2248330395955348

Subject:Computer software and theory

Abstract/Summary:

With the rapid development of the Internet, people now live in an era of "information explosion". Vast amounts of semi-structured and unstructured information are available, and how to mine useful information from it quickly and efficiently is a problem that many scholars are working on. Text document clustering is a method of automatic classification that does not require training. However, most current clustering algorithms are neither fast nor accurate enough.

First, to address this problem, we propose a graph-based text representation model, WSCG (Weighted Subject Conceptual Graph), which divides the concepts of a document into centroid concepts and peripheral concepts based on their semantic relations to the subject. The semantic similarity between two documents is then calculated over the centroid concepts and the peripheral concepts separately. Second, building on existing work on clustering algorithms, we design a text clustering algorithm based on WSCG so that the relation between two documents is calculated more accurately during clustering. Finally, based on this study, a text clustering system, SemCluster, is implemented in C++.

Experiments show that the WSCG-based text representation achieves higher accuracy in document similarity calculation and clustering than existing methods. The text clustering system has also been tested, and the tests show that it meets the design requirements.
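
The idea of scoring centroid concepts and peripheral concepts separately can be illustrated with a minimal sketch. It is not the thesis's actual WSCG computation, which the abstract does not specify. The sketch assumes each document is already reduced to two concept-to-weight maps, compares concepts by exact match with cosine similarity rather than by graph-based semantic relations, and combines the two scores with a hypothetical mixing weight `alpha`. The names `cosine`, `document_similarity`, and `alpha` are illustrative assumptions.

```python
import math


def cosine(a, b):
    """Cosine similarity between two sparse concept -> weight maps."""
    dot = sum(w * b.get(c, 0.0) for c, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty concept set shares nothing with any other set
    return dot / (norm_a * norm_b)


def document_similarity(doc_a, doc_b, alpha=0.7):
    """Combine centroid and peripheral similarity.

    alpha is an assumed weight (not taken from the thesis) that makes
    centroid concepts count more than peripheral ones.
    """
    centroid_sim = cosine(doc_a["centroid"], doc_b["centroid"])
    peripheral_sim = cosine(doc_a["peripheral"], doc_b["peripheral"])
    return alpha * centroid_sim + (1.0 - alpha) * peripheral_sim


# Hypothetical documents: weights express how strongly a concept
# relates to the document's subject.
doc_1 = {
    "centroid": {"clustering": 0.9, "text": 0.8},
    "peripheral": {"internet": 0.3, "speed": 0.2},
}
doc_2 = {
    "centroid": {"clustering": 0.7, "semantics": 0.6},
    "peripheral": {"internet": 0.4},
}

score = document_similarity(doc_1, doc_2)
```

Here `score` lies between 0 and 1 because all weights are non-negative. A pairwise score of this kind is what a clustering step would consume when deciding which documents belong together.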

Keywords/Search Tags:

Text Clustering, Semantic Similarity, WSCG, Fuzzy Clustering

Related items

1	Research Of Web Text Clustering Based On Semantic
2	Research On Text Clustering Based On Semantic Similarity
3	Research On Thesis Text Clustering Based On Semantic Similarity
4	Research On Text Clustering Algorithm Based On Word Frequency And Semantic
5	Research On Key Problems About Large-Scale Text Clustering
6	Search Of Group Intelligent Text Clustering Methods Based On Semantic Similarity
7	Clustering Algorithm Research Of Short Text Based On Semantic Similarity
8	Research On Document Clustering Based On Semantic Similarity Of Hownet
9	Study On The Chinese Text Clustering Algorithm Based On Semantic Similarity
10	The Study And Application Of New Clustering Algorithms In Image Processing And Text Clustering