Data mining, which aims to extract novel and useful knowledge from large volumes of data, has emerged rapidly in recent decades as a comprehensive approach to automatic, intelligent data analysis; it integrates techniques from machine learning, statistical learning, and databases. Classification, the task of predicting the class label of data using supervision obtained from empirical (training) data, is a basic problem in pattern recognition, machine learning, and statistics, as well as in data mining. The Support Vector Machine (SVM) has become one of the rising data mining techniques because of its sound theoretical foundation (VC dimension, structural risk minimization, and kernel space theory). SVM is a relatively new general-purpose learning machine based on statistical learning theory. To solve a complicated classification task, it maps vectors from the input space to a feature space in which a linear separating hyperplane is constructed. As an implementation of structural risk minimization, SVM has the advantages of global optimization, simple structure, and high practicability. This thesis first addresses the concepts, key techniques, mining targets, basic mining processes, and prospects of data mining. With emphasis on the support vector machine, we then discuss its background, theoretical foundation, basic concepts, and crucial techniques, and study several common SVM training algorithms, especially Platt's Sequential Minimal Optimization (SMO) algorithm. After analyzing the main cause of inefficiency in Platt's SMO algorithm, namely its use of a single threshold value, we present an improved SMO algorithm that employs two threshold parameters.
Experiments demonstrate that the improved SMO algorithm performs better than the original SMO algorithm on heart-disease and breast-cancer datasets. In the last part of this thesis, we propose a prototype data mining system based on the support vector machine. This thesis introduces SVM into the data mining field, which offers a new option for designing data mining systems.
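
To make the two-threshold idea concrete, the following is a minimal sketch and not the implementation evaluated in this thesis. It assumes the two thresholds follow the widely known dual-threshold optimality check for SMO (commonly attributed to Keerthi et al.), in which a single bias value is replaced by a pair (b_up, b_low). The function name `two_threshold_check`, the tolerance `tol`, and the error-cache values `F` (with F_i = sum_j alpha_j y_j K(x_i, x_j) - y_i) are illustrative assumptions.

```python
def two_threshold_check(alpha, y, F, C, tol=1e-3):
    """Return (b_up, b_low, is_optimal) for the current SMO iterate.

    alpha: Lagrange multipliers, each in [0, C]
    y:     class labels, each +1 or -1
    F:     assumed error cache, F[i] = sum_j alpha[j]*y[j]*K(x_i, x_j) - y[i]
    """
    eps = 1e-12  # numerical slack for deciding whether alpha sits on a bound
    up_vals, low_vals = [], []
    for a, yi, fi in zip(alpha, y, F):
        interior = eps < a < C - eps
        at_zero = a <= eps
        at_c = a >= C - eps
        # Points that bound the threshold from above.
        if interior or (at_zero and yi == 1) or (at_c and yi == -1):
            up_vals.append(fi)
        # Points that bound the threshold from below.
        if interior or (at_c and yi == 1) or (at_zero and yi == -1):
            low_vals.append(fi)
    b_up = min(up_vals, default=float("inf"))
    b_low = max(low_vals, default=float("-inf"))
    # The iterate is optimal (within tol) when the two thresholds meet.
    return b_up, b_low, b_low <= b_up + 2 * tol


# Usage: at the all-zero starting point, F[i] = -y[i], so the check fails.
b_up, b_low, done = two_threshold_check([0.0, 0.0], [1, -1], [-1.0, 1.0], C=1.0)
```

Under this assumption, training stops only when b_low <= b_up + 2*tol, and the pair of indices attaining b_low and b_up is a natural choice for the next joint update. This avoids relying on a single threshold that may be poorly estimated in the middle of training.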