
Research And Implementation On The E-mail Classification System Based On Text Clustering Technology

Posted on: 2006-05-05
Degree: Master
Type: Thesis
Country: China
Candidate: Z S Tian
Full Text: PDF
GTID: 2168360152971336
Subject: Computer software and theory
Abstract/Summary:
With the development of Internet technology, many kinds of network application services have emerged. Among them, E-mail is used extensively and has become a fast and economical means of communication. How to manage the many E-mails received in daily life is a problem that urgently needs to be solved.

Because the main part of an E-mail is text, earlier E-mail classification systems were mainly based on text classification technology. This approach has two shortcomings. First, text classification requires every category to be specified in advance, and it needs a large number of training samples to produce accurate results. Second, existing E-mail classification systems consider only the text and ignore the other characteristics of an E-mail.

We therefore propose an E-mail classification system based on text clustering technology. The system uses a text clustering algorithm to classify E-mails and also combines some additional properties of the E-mail to improve the clustering result, so that both shortcomings mentioned above are reduced. The paper introduces the key technologies and methods behind this clustering model. The main contents are: using an improved vector space model (VSM) to represent the text in the computer; using the word strength (WS) technique for feature selection; adding the additional information of each E-mail when calculating the similarity between E-mails; and combining two kinds of algorithms in order to adapt to E-mail clustering. In addition, we carried out an experimental analysis on the basis of this theory, and the results support the correctness of the approach.
Keywords/Search Tags: E-mail classification, text clustering, vector space model (VSM), Hierarchical Method, K-Means
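
The abstract describes computing the similarity between E-mails from a vector space representation of the text plus additional E-mail information. The following is a minimal sketch of that idea, not the thesis's implementation: the use of plain TF-IDF weighting, the choice of the sender address as the "additional information", the weight `alpha`, and all function names are assumptions made for illustration. The improved VSM and the word strength (WS) feature selection mentioned in the abstract are not reproduced here.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep purely alphabetic tokens."""
    return [w for w in text.lower().split() if w.isalpha()]


def tfidf_vectors(docs):
    """Represent each document as a sparse TF-IDF vector (dict: term -> weight)."""
    tokenized = [tokenize(d) for d in docs]
    n = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(t for toks in tokenized for t in set(toks))
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        # Smoothed IDF so that every weight stays positive.
        vectors.append(
            {t: c * (math.log((1 + n) / (1 + df[t])) + 1.0) for t, c in tf.items()}
        )
    return vectors


def cosine(a, b):
    """Cosine similarity of two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def email_similarity(e1, e2, v1, v2, alpha=0.8):
    """Blend text similarity with a metadata match (assumed: same sender).

    alpha is a hypothetical weight on the text part; 1 - alpha goes to metadata.
    """
    text_sim = cosine(v1, v2)
    meta_sim = 1.0 if e1["sender"] == e2["sender"] else 0.0
    return alpha * text_sim + (1.0 - alpha) * meta_sim


# Toy data, invented for this example only.
emails = [
    {"sender": "alice@example.com", "body": "project meeting schedule for monday"},
    {"sender": "alice@example.com", "body": "project meeting notes and schedule"},
    {"sender": "shop@example.com", "body": "discount sale on shoes today"},
]
vectors = tfidf_vectors([e["body"] for e in emails])
sim_related = email_similarity(emails[0], emails[1], vectors[0], vectors[1])
sim_unrelated = email_similarity(emails[0], emails[2], vectors[0], vectors[2])
print(sim_related > sim_unrelated)
```

A pairwise similarity of this kind could then be fed to a clustering step such as the hierarchical and K-Means methods named in the keywords; how the thesis combines the two algorithms is not stated in the abstract, so that step is left out of the sketch.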