Application Of Support Vector Machines In Data Mining

Posted on: 2003-09-19
Degree: Master
Type: Thesis
Country: China
Candidate: X G Dong
GTID: 2168360092966432
Subject: Computer application technology
Abstract/Summary:
Data mining is an applied technology that draws on many disciplines, such as statistics and machine learning. At present, most data mining algorithms are refinements of methods from statistics and machine learning. Classification means predicting the class label of unseen data using a classifier learned from empirical (training) data. It is a basic problem in pattern recognition, machine learning and statistics, as well as in data mining, and it can be regarded as a problem of learning from empirical data.

Unlike the asymptotic theory of classical statistics, statistical learning theory specifically studies the laws of machine learning when the number of samples is finite. It proves that the bound on the actual risk is made up of the empirical risk and a confidence interval. The VC dimension is used to control generalization ability, and the structural risk minimization inductive principle controls the bound on the achieved risk by controlling the empirical risk and the confidence interval at the same time.

The support vector machine (SVM) is a new kind of general learning machine based on statistical learning theory. To solve a complicated classification task, it maps the vectors from the input space to a feature space in which a linear separating hyperplane is constructed. The margin is the distance between the separating hyperplane and a parallel hyperplane through the closest points. The separating hyperplane with the maximal margin is the optimal separating hyperplane, which has good generalization ability. Finding the optimal separating hyperplane leads to a quadratic programming problem, which is a special kind of optimization problem. After optimization, each training vector is assigned a weight. A vector whose weight is not zero is called a support vector, and the separating hyperplane is constructed from the support vectors alone.

The larger data sets found in the real world demand higher efficiency. Decomposition is the first practical method for dealing with larger data sets. It splits the training set into two parts, active and inactive; the active part is called the working set. Only the variables in the working set can be updated in the current iteration, so each iteration solves only a sub-optimization problem. Decomposition, together with optimization techniques based on feasible directions, provides a feasible method for solving the support vector machine training problem.

In this paper, the performance of SVM in a data mining system for adjusting pump parameters is compared with that of the back-propagation (BP) algorithm. The results show that SVM is better than the BP algorithm in some respects. This paper introduces SVM into the data mining field, with the aim of drawing more attention from data mining researchers and offering a new choice when designing a data mining system.
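The idea of optimizing over a small working set while the remaining weights stay fixed can be illustrated with a minimal sketch. The code below is not the method of the thesis, which relies on feasible-direction optimization. It swaps in a simplified two-variable working set in the style of sequential minimal optimization, uses a linear kernel, and runs on a made-up six-point data set. All names (`train_svm`, `predict`, `C`, `tol`, `max_sweeps`) are hypothetical and chosen only for illustration.

```python
def linear_kernel(u, v):
    # Plain dot product; a nonlinear kernel would replace this function.
    return sum(a * b for a, b in zip(u, v))


def train_svm(X, y, C=10.0, tol=1e-3, max_sweeps=1000):
    """Simplified decomposition: each step updates a working set of two weights."""
    n = len(X)
    alpha = [0.0] * n  # one weight per training vector
    b = 0.0
    K = [[linear_kernel(X[i], X[j]) for j in range(n)] for i in range(n)]

    def f(i):
        # Decision value of the current model on training point i.
        return sum(alpha[k] * y[k] * K[k][i] for k in range(n)) + b

    for _ in range(max_sweeps):
        changed = 0
        for i in range(n):
            E_i = f(i) - y[i]
            # Skip points that already satisfy the optimality conditions.
            violates = (y[i] * E_i < -tol and alpha[i] < C) or \
                       (y[i] * E_i > tol and alpha[i] > 0)
            if not violates:
                continue
            for j in range(n):
                if j == i:
                    continue
                E_j = f(j) - y[j]
                # Box bounds that keep sum(alpha * y) unchanged.
                if y[i] != y[j]:
                    lo = max(0.0, alpha[j] - alpha[i])
                    hi = min(C, C + alpha[j] - alpha[i])
                else:
                    lo = max(0.0, alpha[i] + alpha[j] - C)
                    hi = min(C, alpha[i] + alpha[j])
                eta = K[i][i] + K[j][j] - 2.0 * K[i][j]
                if lo >= hi or eta <= 0:
                    continue
                # Solve the two-variable sub-problem analytically, then clip.
                aj = min(hi, max(lo, alpha[j] + y[j] * (E_i - E_j) / eta))
                if abs(aj - alpha[j]) < 1e-12:
                    continue
                ai = alpha[i] + y[i] * y[j] * (alpha[j] - aj)
                b1 = b - E_i - y[i] * (ai - alpha[i]) * K[i][i] \
                     - y[j] * (aj - alpha[j]) * K[i][j]
                b2 = b - E_j - y[i] * (ai - alpha[i]) * K[i][j] \
                     - y[j] * (aj - alpha[j]) * K[j][j]
                if 0 < ai < C:
                    b = b1
                elif 0 < aj < C:
                    b = b2
                else:
                    b = (b1 + b2) / 2.0
                alpha[i], alpha[j] = ai, aj
                changed += 1
                break
        if changed == 0:
            break
    return alpha, b


def predict(X, y, alpha, b, x):
    score = sum(a * t * linear_kernel(p, x) for a, t, p in zip(alpha, y, X)) + b
    return 1 if score >= 0 else -1


# Made-up, linearly separable toy data (not from the thesis).
X = [(0, 0), (1, 0), (0, 1), (3, 3), (3, 4), (4, 3)]
y = [-1, -1, -1, 1, 1, 1]
alpha, b = train_svm(X, y)

# Vectors whose weight is not zero are the support vectors.
support_vectors = [X[i] for i in range(len(X)) if alpha[i] > 1e-8]
```

Only the vectors collected in `support_vectors` contribute to the decision function in `predict`; every other term in the sum is multiplied by a zero weight. A practical decomposition solver would pick larger working sets and choose them by a steepest feasible direction rule instead of the simple scan used here.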
Keywords/Search Tags: support vector machines, statistical learning, data mining classification, decomposition