Research and Implementation of Algorithms for SVM-Based Chinese Text Classification

Posted on: 2014-09-18  Degree: Master  Type: Thesis
Country: China  Candidate: R Zhang  Full Text: PDF
GTID: 2268330401473435  Subject: Electronics and Communications Engineering
Abstract/Summary:
Text information is growing rapidly along with the volume of information on the network. Among the many classification algorithms, the Support Vector Machine (SVM) has become one of the most important research topics. This dissertation focuses on automatic text classification and studies how to obtain useful information from large amounts of miscellaneous text.

Centering on SVM, the dissertation examines the algorithms relevant to Chinese text classification and, on the basis of these algorithms, puts forward an SVM classifier for Chinese text.

The TFIDF feature weighting is analyzed in detail, in particular the weighting process itself. Improvements are proposed for two distinct drawbacks, one concerning the word frequency of feature words and the other the overall distribution of the training set, and they are applied in the computation of the feature weights.

SVM suffers from large training sets, slow training, and slow classification. After an in-depth study of SVM classification algorithms, the dissertation presents a method that reduces the number of texts used in the classification process, which speeds up SVM training. In addition, OPTICS density clustering is used to extract the original samples that play a crucial role in classification, and a new training set is built from them.

Finally, the classification results are evaluated by comparing performance indices such as accuracy and recall. The experiments show that the classifier achieves good classification results and has considerable practical value.
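As a point of reference for the weighting and evaluation steps mentioned in the abstract, the following is a minimal sketch of the standard textbook TF-IDF weight, tf(t, d) * log(N / df(t)), together with accuracy and recall. It is not the improved weighting proposed in the dissertation, whose exact form the abstract does not give. The function names `tfidf` and `accuracy_and_recall`, the toy documents, and the assumption that Chinese text has already been segmented into word tokens are all illustrative.

```python
import math
from collections import Counter


def tfidf(docs):
    """Standard TF-IDF weights for a list of pre-segmented documents.

    docs: list of token lists (word segmentation is assumed to be done).
    Returns one {term: weight} dict per document, using
    weight = (term count / document length) * log(N / document frequency).
    """
    n_docs = len(docs)
    doc_freq = Counter()
    for tokens in docs:
        doc_freq.update(set(tokens))  # count each term once per document

    weights = []
    for tokens in docs:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


def accuracy_and_recall(y_true, y_pred, positive):
    """Overall accuracy, and recall for the class named by `positive`."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    accuracy = correct / len(y_true) if y_true else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, recall


if __name__ == "__main__":
    # Hypothetical toy corpus: three short, already-segmented documents.
    toy_docs = [["体育", "比赛", "比赛"], ["体育", "新闻"], ["财经", "新闻"]]
    for doc_weights in tfidf(toy_docs):
        print(doc_weights)
    print(accuracy_and_recall(["a", "a", "b", "b"], ["a", "b", "b", "b"], "a"))
```

In this standard form a term that occurs in every document receives weight zero, and the weight depends only on raw counts, not on how a term is spread across categories. Limitations of this kind are commonly cited for TF-IDF and may be related to the drawbacks the abstract refers to.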
Keywords/Search Tags: Text Categorization, TFIDF, Density Clustering, Support Vector Machine (SVM)