
Research on Multi-class Text Categorization Method Based on Fuzzy Support Vector Machine

Posted on: 2010-07-20  Degree: Master  Type: Thesis
Country: China  Candidate: Y Jie  Full Text: PDF
GTID: 2178360275980504  Subject: Computer application technology
Abstract/Summary:
Text classification involves a large number of classes and training examples as well as a great deal of noise, so the fuzzy support vector machine method is applied here to multi-class text classification. An approach to constructing a text classifier based on a fuzzy support vector machine and a decision tree is proposed. It considers the relationship between a sample and its cluster center and combines this with a tangent sphere constructed from the hyperplane that contains the support vectors and is parallel to the classification hyperplane of the traditional support vector machine, in order to further determine the relation of all samples in the class. The membership of a sample in a class is computed from the location of the sample in the sphere, so that effective samples, noise, and outliers can be distinguished in a reasonable way. Multi-class classification is then realized by integrating the decision tree method.
The main research issue of this thesis is the determination of the fuzzy membership in the fuzzy support vector machine. In contrast to some current fuzzy support vector machines, which define the fuzzy membership as a function of the distance between a point and its class center, a new and more effective fuzzy membership defined as a function of two spheres is proposed for measuring the inaccuracy of samples. It considers the relationship between a sample and its cluster center and combines this with two spheres constructed from the cluster center and the classification hyperplane of the traditional support vector machine, in order to further determine the relation of all samples in the class. The membership of a sample in a class is computed from the location of the sample in the spheres.
Finally, tests on the Reuters-21578 corpus show that both methods solve text classification problems efficiently and effectively. Compared with the traditional support vector machine and with fuzzy support vector machines based on the distance between a sample and its cluster center, the approach based on a fuzzy support vector machine and a decision tree distinguishes effective samples, noise, and outliers more effectively and achieves better classification results. The fuzzy support vector machine based on the two spheres is more robust than the traditional support vector machine and than the fuzzy support vector machines that use the other two fuzzy memberships.
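For orientation, the baseline that the abstract compares against (a fuzzy membership that depends only on the distance between a sample and its class center) can be sketched as follows. This is a minimal illustration and not the tangent-sphere or two-sphere membership proposed in the thesis, whose formulas are not given in the abstract. The function names, the linear decay form 1 - d / (r + delta), and the parameter `delta` are assumptions chosen for the sketch.

```python
import math


def class_center(points):
    """Return the mean vector (class center) of a list of equal-length points."""
    dim = len(points[0])
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(dim)]


def distance_memberships(points, delta=1e-6):
    """Assumed baseline: membership decays linearly with distance to the class center.

    s_i = 1 - d_i / (r + delta), where d_i is the Euclidean distance of sample i
    to the class center and r is the largest such distance in the class.
    delta > 0 keeps every membership strictly positive.
    """
    center = class_center(points)
    dists = [math.dist(p, center) for p in points]
    radius = max(dists)
    return [1.0 - d / (radius + delta) for d in dists]


# Hypothetical one-class sample set; the last point lies far from the rest.
samples = [(0.0, 0.0), (2.0, 0.0), (1.0, 0.0), (10.0, 0.0)]
memberships = distance_memberships(samples)
for point, s in zip(samples, memberships):
    print(point, round(s, 4))
```

In this sketch the far-away point receives the smallest membership, so it would contribute least to the training error term of a fuzzy support vector machine. Note that the distant point also pulls the class center toward itself, which is one reason a purely center-distance membership can misjudge samples.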
Keywords/Search Tags:text classification, multi-class classification, fuzzy support vector machine, fuzzy memberships, sphere