Improvements and Research on the Support Vector Machine Algorithm

Posted on: 2019-01-19
Degree: Master
Type: Thesis
Country: China
Candidate: C C Guo
GTID: 2428330572460748
Subject: Computer Science and Technology
Abstract/Summary:
The explosive growth of data in the information society has ushered in the era of big data. Most traditional data processing tools were designed for small samples and structured data, so they struggle to meet the requirements of big data processing. As a result, a large number of new data processing models have emerged.

The support vector machine (SVM) is a machine learning method with wide influence and many applications. Because the algorithm does not rely on probability measures or the law of large numbers, it bypasses the traditional route from induction to deduction, which simplifies the handling of classification problems. In addition, the final decision function of an SVM is determined by only a few support vectors, which lets the method avoid the "curse of dimensionality" and gives it excellent generalization ability.

In practical applications, however, support vector machines also have obvious shortcomings. First, they find the support vectors by solving a quadratic programming problem. When the sample size is large, this consumes a great deal of memory and time, so the method is suitable only for small-sample classification problems. Second, support vector machines were originally designed for binary classification, so the classifier model must be modified to handle multilabel classification problems.

This thesis presents improvements that address several problems of current support vector machines: low classification accuracy on complex data sets, "excessive" classification of outliers, and the lack of multilabel classification capability. The specific research content is as follows.

First, an improved SVM algorithm that introduces distance parameters is proposed. This method uses a distance metric in place of the slack variable in the original support vector model, so that the algorithm improves its predictions for minority classes.

Second, a multilabel classification algorithm that combines membership degrees from rough set and fuzzy set theory with the traditional support vector machine is studied. Based on the theory of rough sets and fuzzy sets, the algorithm obtains the membership degrees of the abnormal samples that affect the classification results and of their nearest support vectors. It also makes fuller use of the original samples and brings the clustering result closer to the underlying structure of the sample set.

Third, an improved method is studied for the defects of the classic multilabel classification strategy One-against-all. This method simplifies the decision function of the sample area mathematically and merges decision boundaries. The samples in the merged region are then reclassified according to the principle of assigning each sample to the region with the highest degree of membership. The experimental results show that the improved method can classify samples in the regions that One-against-all leaves unclassifiable.
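To make the One-against-all "unclassifiable area" concrete, the following is a minimal sketch and not the thesis's algorithm. It assumes each class k has a linear decision function f_k(x) = w_k · x + b_k, and it uses the raw decision value as a stand-in for the degree of membership, which is an assumption of this example. The names `one_against_all`, `resolve_by_max_score`, and the three classifiers are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple

# One linear classifier per class: (weights w_k, bias b_k). Hypothetical values.
Classifier = Tuple[Sequence[float], float]


def decision_values(x: Sequence[float], classifiers: List[Classifier]) -> List[float]:
    """Evaluate f_k(x) = w_k . x + b_k for every class k."""
    return [sum(w_i * x_i for w_i, x_i in zip(w, x)) + b for w, b in classifiers]


def one_against_all(x: Sequence[float], classifiers: List[Classifier]) -> Optional[int]:
    """Plain One-against-all: a sample gets class k only if exactly one
    classifier is positive. Otherwise it is unclassifiable (None)."""
    values = decision_values(x, classifiers)
    positive = [k for k, v in enumerate(values) if v > 0]
    return positive[0] if len(positive) == 1 else None


def resolve_by_max_score(x: Sequence[float], classifiers: List[Classifier]) -> int:
    """Fallback for unclassifiable samples: pick the class with the largest
    decision value (assumed here as a proxy for the highest membership)."""
    values = decision_values(x, classifiers)
    return max(range(len(values)), key=values.__getitem__)


# Three hypothetical linear classifiers in two dimensions.
classifiers: List[Classifier] = [
    ((1.0, 0.0), -1.0),   # class 0 is positive when x1 > 1
    ((-1.0, 0.0), -1.0),  # class 1 is positive when x1 < -1
    ((0.0, 1.0), -1.0),   # class 2 is positive when x2 > 1
]

print(one_against_all((2.0, 0.0), classifiers))       # 0: only class 0 is positive
print(one_against_all((0.5, 0.2), classifiers))       # None: no classifier is positive
print(resolve_by_max_score((0.5, 0.2), classifiers))  # 0: least negative decision value
print(one_against_all((2.0, 3.0), classifiers))       # None: classes 0 and 2 both positive
print(resolve_by_max_score((2.0, 3.0), classifiers))  # 2: largest decision value
```

The two `None` cases show the two kinds of unclassifiable region: points that no classifier claims and points that several classifiers claim. The thesis resolves these with merged decision boundaries and a membership degree; the argmax fallback above only illustrates the general idea of reassigning such samples.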
Keywords/Search Tags: Support Vector Machine, Distance Metric, Clustering, Multilabel Classification, One-against-all