Research On Efficient And Robust SVM Based On CRF

Posted on:2022-07-01

Degree:Master

Type:Thesis

Country:China

Candidate:Q Y He

GTID:2518306575967089

Subject:Computer technology

Abstract/Summary:

In machine learning, noise in the original training set is generally divided into attribute noise and label noise. In most cases label noise is more harmful than attribute noise, and it can seriously degrade a classifier's validation accuracy. To eliminate these negative effects, label noise is mainly handled either by filtering it out or by using robust algorithms. Along these lines, a label noise filtering learning framework based on completely random forest (CRF-NFL) has been proposed, which uses a completely random forest (CRF) as the filter. The CRF-NFL framework not only filters label noise effectively but can also be combined with various classifiers trained on the filtered training set; that is, it can be paired with other robust algorithms to handle label noise and further improve filtering performance. However, the framework has two shortcomings. First, the completely random forest has not been optimized, which limits the validation accuracy the classifier can reach. Second, it focuses only on combining the filter with various classifiers. For example, when the classical support vector machine (SVM) is chosen as the combined classifier, the result is the CRF-NFL-SVM model; the robustness of the SVM itself in binary classification is not considered, and on training sets with high noise the performance of the CRF-NFL-SVM model is not ideal. To address these two shortcomings, this thesis carries out research based on the CRF-NFL framework and support vector machine theory.

First, this thesis optimizes the label noise filtering method based on completely random forest. By optimizing the voting threshold, label noise in the original training set is filtered more effectively and the classifier's validation accuracy is higher. At the same time, because completely random forests, like random forests, involve no pruning step, and the support vector machine does not need cross-validation, efficiency is also improved.

Second, this thesis proposes a method to improve the robustness of the SVM algorithm in binary classification. Because label noise is an important factor that makes the original data set linearly non-separable, the key to transforming the linearly non-separable problem into a linearly separable one is, in theory, to maximize the penalty coefficient. The maximum-margin hyperplane can then be solved as in the linearly separable case, which increases the algorithm's resistance to noise and improves the robustness of the SVM.

Finally, building on these improvements, this thesis proposes an efficient and robust support vector machine model based on completely random forest (CRF-ERSVM). On UCI data sets with 20% noise, the validation accuracy of this model is 5.18% and 4.18% higher than that of the classical support vector machine model and the CRF-NFL-SVM model, respectively.
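The idea of filtering label noise by voting across completely random trees can be sketched as follows. This is a simplified illustration, not the thesis's CRF-NFL procedure: it assumes each tree is grown on a bootstrap sample with a uniformly random feature and split value until every leaf is pure, and that a sample is flagged when the fraction of out-of-bag trees disagreeing with its given label exceeds a voting threshold. The function names, the out-of-bag voting rule, the default `threshold=0.5`, and the toy data are all assumptions made for this sketch.

```python
import random
from collections import Counter


def grow_tree(X, y, idx, rng):
    """Grow one completely random tree on samples `idx` until leaves are pure."""
    labels = [y[i] for i in idx]
    if len(set(labels)) == 1:
        return labels[0]  # pure leaf: store the label itself
    # Only features that still vary inside this node can split it.
    feats = [j for j in range(len(X[0])) if len({X[i][j] for i in idx}) > 1]
    if not feats:  # identical points with conflicting labels: majority leaf
        return Counter(labels).most_common(1)[0][0]
    j = rng.choice(feats)
    lo = min(X[i][j] for i in idx)
    hi = max(X[i][j] for i in idx)
    t = rng.uniform(lo, hi)
    if t >= hi:  # guard against the rare rounded-up endpoint
        t = lo
    left = [i for i in idx if X[i][j] <= t]
    right = [i for i in idx if X[i][j] > t]
    return (j, t, grow_tree(X, y, left, rng), grow_tree(X, y, right, rng))


def predict(tree, x):
    """Follow the splits down to a leaf and return its label."""
    while isinstance(tree, tuple):
        j, t, left, right = tree
        tree = left if x[j] <= t else right
    return tree


def filter_label_noise(X, y, n_trees=200, threshold=0.5, seed=0):
    """Return indices whose out-of-bag disagreement rate exceeds `threshold`."""
    rng = random.Random(seed)
    n = len(X)
    wrong = [0] * n  # out-of-bag votes that contradict the given label
    seen = [0] * n   # out-of-bag votes received in total
    for _ in range(n_trees):
        bag = [rng.randrange(n) for _ in range(n)]  # bootstrap sample
        tree = grow_tree(X, y, bag, rng)
        in_bag = set(bag)
        for i in range(n):
            if i not in in_bag:
                seen[i] += 1
                wrong[i] += predict(tree, X[i]) != y[i]
    return [i for i in range(n) if seen[i] and wrong[i] / seen[i] > threshold]


# Toy data (hypothetical): two well-separated 6x6 grids, four labels flipped.
X = [(float(a), float(b)) for a in range(6) for b in range(6)]
X += [(a + 20.0, b + 20.0) for a in range(6) for b in range(6)]
y = [0] * 36 + [1] * 36
flipped = [7, 28, 43, 64]
for i in flipped:
    y[i] = 1 - y[i]

flagged = filter_label_noise(X, y)
clean_idx = [i for i in range(len(X)) if i not in set(flagged)]
```

Raising `threshold` makes the filter more conservative (fewer samples removed), while lowering it removes more suspected noise at the risk of discarding clean samples; tuning this trade-off is the kind of voting-threshold optimization the abstract refers to. The samples in `clean_idx` would then be passed to the downstream classifier.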
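The penalty-coefficient argument in the abstract can be illustrated with the standard soft-margin SVM formulation. The notation below (weight vector $w$, bias $b$, slack variables $\xi_i$, penalty coefficient $C$, and $n$ training pairs $(x_i, y_i)$ with $y_i \in \{-1, +1\}$) is textbook notation assumed here for illustration; the thesis's exact formulation may differ.

```latex
% Soft-margin SVM (primal): slack \xi_i lets sample i violate the margin at cost C.
\min_{w,\,b,\,\xi}\ \frac{1}{2}\lVert w\rVert^{2} + C\sum_{i=1}^{n}\xi_i
\quad\text{s.t.}\quad
y_i\,(w^{\top}x_i + b) \ge 1 - \xi_i,\qquad \xi_i \ge 0,\qquad i = 1,\dots,n.

% As C \to \infty, any \xi_i > 0 becomes infinitely costly, so on linearly
% separable data the solution coincides with the hard-margin problem:
\min_{w,\,b}\ \frac{1}{2}\lVert w\rVert^{2}
\quad\text{s.t.}\quad
y_i\,(w^{\top}x_i + b) \ge 1,\qquad i = 1,\dots,n.
```

Under this reading, the hard-margin limit only has a solution when the training set is linearly separable, which is presumably why label noise is filtered first. Fixing $C$ at a very large value would also remove the need to select it by cross-validation.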

Keywords/Search Tags:

completely random forest, label noise, voting optimization, support vector machines, robustness

Related items

1	Research On Label Noise Based On Ensemble Learning
2	Label Noise Cleaning Using Support Vector Machines
3	The Performance Optimization And Application For The Classifier On Classification Noise Detection
4	The Improvement Of The Voting Method For Multi-class SVM Classification
5	Research On Some Issues In Support Vector Machines
6	A Fast Multi-label Classification Algorithm Based On Binary And Triple Class Support Vector Machines
7	Research On Classification Method Of Random Support Vector Machine And Its Application
8	Application Of Random Forest In Microfinance
9	Studies Of Some Problems In Support Vector Machines And Semi-supervised Learning
10	Research On Twin Support Vector Machines And Its Optimization Methods