Data Mining Based On Support Vector Machine

Posted on: 2005-07-28
Degree: Master
Type: Thesis
Country: China
Candidate: J H Zheng
GTID: 2168360125462876
Subject: Computer application technology
Abstract/Summary:
In recent years, computer technology, and database technology in particular, has developed rapidly; the scope of human activity has expanded and the pace of life has quickened. People can acquire and store data more quickly, easily, and cheaply, so the volume of data and information grows exponentially. Faced with this volume, people are under the pressure of "information explosion" and "data glut". Massive data is worthless if it cannot be exploited; it is knowledge that drives the development of society. Data mining is a technology that discovers underlying rules and extracts valuable knowledge from data.

Data mining has many branches, one of which is classification rule mining. Applying a suitable training algorithm to training data yields a classifier that can predict the labels of unseen examples. The support vector machine (SVM) is a new classification algorithm based on statistical learning theory. Compared with other classifiers, SVM offers better generalization performance and higher prediction accuracy on test examples, and it has therefore been widely applied.

The naive SVM handles only binary classification. In this thesis, after a discussion of current multiclass SVMs, a novel multiclass SVM classifier based on geometric distance is proposed. In addition, the probability output of the binary SVM is generalized to the multiclass SVM without iterative computation, which improves prediction accuracy while keeping computation fast. Numerical experiments show that both methods generalize well, which increases prediction accuracy on unseen examples.

Chapter 1 of this thesis introduces the history and related theories of data mining. Chapter 2 discusses its taxonomy, processing models, and some popular technologies. Chapter 3 introduces statistical learning theory and SVM, and Chapter 4 proposes a new multiclass SVM based on geometric distance. Chapter 5 generalizes the probability output to multiclass problems. Finally, some directions for further research are pointed out.
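
The abstract does not spell out the proposed methods, so the following is only an illustrative sketch of the two general ideas it names, not the thesis's algorithm. It assumes a one-vs-rest setup with linear hyperplanes, picks the class whose hyperplane gives the largest signed geometric distance (w·x + b)/||w||, and turns those distances into class probabilities in a single pass with a Platt-style sigmoid followed by normalization. The hyperplane values, the function names, and the sigmoid parameters `a` and `c` are all hypothetical.

```python
import math

# Hypothetical one-vs-rest hyperplanes: label -> (w, b). Illustrative values only.
HYPERPLANES = {
    "left": ((-2.0, 0.0), -2.0),
    "right": ((3.0, 0.0), -3.0),
    "up": ((0.0, 4.0), -4.0),
}


def geometric_distance(w, b, x):
    """Signed distance from point x to the hyperplane w.x + b = 0."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm


def predict_by_distance(hyperplanes, x):
    """Return the label whose hyperplane gives the largest signed distance."""
    return max(
        hyperplanes,
        key=lambda label: geometric_distance(*hyperplanes[label], x),
    )


def class_probabilities(hyperplanes, x, a=-1.0, c=0.0):
    """Sigmoid of each distance, normalized to sum to 1 (no iteration).

    a and c are assumed sigmoid parameters; in practice they would be
    fitted on held-out data rather than fixed.
    """
    scores = {
        label: 1.0 / (1.0 + math.exp(a * geometric_distance(w, b, x) + c))
        for label, (w, b) in hyperplanes.items()
    }
    total = sum(scores.values())
    return {label: s / total for label, s in scores.items()}


point = (2.0, 1.9)
print(predict_by_distance(HYPERPLANES, point))
print(class_probabilities(HYPERPLANES, point))
```

Dividing by ||w|| makes the outputs of differently scaled binary classifiers comparable: at the sample point the raw decision value w·x + b is larger for "up", while the geometric distance is larger for "right".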
Keywords/Search Tags: Data Mining, Statistical Learning Theory, Support Vector Machine, Posterior Probability