The Cluster Analysis On WEB Text Mining

Posted on: 2006-01-26
Degree: Master
Type: Thesis
Country: China
Candidate: Z Z Zhang
Full Text: PDF
GTID: 2168360155959904
Subject: Software engineering
Abstract/Summary:
Advances in data collection and access technology have increased the size and dimensionality of databases. How to mine valuable information from them has been widely discussed, giving rise to a new field, data mining (DM). Basic statistical methods are sufficient for analyzing small data sets, but DM usually deals with large ones. Clustering and classification are important parts of DM.

Text is the most natural medium for accessing and exchanging information, so text mining is an especially important part of DM. Some clustering algorithms that work well on databases are not applicable to text mining. Increasing dimensionality and limited random access memory (RAM) demand efficient algorithms; the growing volume of web text and the emergence of new information types demand algorithms that can easily handle additional data and new clusters.

These problems are difficult to solve with basic clustering methods. This paper discusses text clustering using Bayesian methods from probability theory together with information theory, and proposes two clustering algorithms developed through extensive testing. In tests on data and in comparisons with other algorithms, the two proposed algorithms perform well.
Keywords/Search Tags: data mining, text mining, clustering, data set, Bayes method