Classification And Application Of Improved Weighted KNN Algorithm Based On SVM To Unbalanced Data

Posted on: 2021-04-03
Degree: Master
Type: Thesis
Country: China
Candidate: M X Cai
GTID: 2428330629480377
Subject: Detection Technology and Automation
Abstract/Summary:
In recent years, the explosive growth of data has driven a rising demand for data processing, and the data mining techniques of machine learning have been adopted in all walks of life. The analysis of large data sets has gradually shifted from manual work to more intelligent and convenient data mining techniques for classifying and integrating data. Classification plays an important role in data mining, and research on classification methods is largely a pursuit of higher classification accuracy. This thesis analyzes in detail the principles of the mainstream algorithms widely used in current classification technology. Considering how these algorithms are applied in practice, together with their respective advantages and characteristics, it selects two of them, the support vector machine (SVM) and k-nearest neighbor (KNN), as the objects of study. On this basis, an SVM-based harmonic weighted KNN algorithm (HWSKNN) is proposed that combines the SVM classifier with the KNN classifier. The main contents and work of this thesis are as follows:

1) The thesis summarizes the classification algorithms in wide use today and analyzes in detail the classification principles and characteristics of the SVM and KNN algorithms. Drawing on the current state of research on KNN and its improved variants for classifying unbalanced data sets, it proposes an improved weighted KNN algorithm with an adjustable factor. The factor attenuates the weight given to minority classes when classifying unbalanced sample sets, so that the results are not excessively biased toward the minority classes, which reduces overfitting in the classification results.

2) The thesis studies a distinguishing feature of SVM classification: SVM performs well on samples far from the decision boundary, while its classification errors are concentrated in the region around the boundary. Following this characteristic and the principle of the SVM-KNN hybrid classifier, the improved weighted KNN algorithm is introduced in the region around the decision boundary, and a threshold test determines which classifier is more appropriate for each sample. Because KNN can effectively improve accuracy near the boundary, the thesis combines the strengths of both algorithms into a hybrid of SVM and the improved weighted KNN classifier, improving on the classification performance of the SVM-KNN hybrid classifier.

3) To compare the accuracy of the traditional SVM-KNN algorithm with that of HWSKNN, classification is tested on different types of data sets. The thesis classifies data from a text sample data set and from UC Irvine Machine Learning Repository (UCI) data sets to verify the effectiveness of the proposed algorithm. The theoretical analysis and the classification experiments indicate that the improved algorithm maintains classification accuracy under an ideal data distribution, while on unbalanced data sets its classification performance improves over the original SVM-KNN classifier.
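
The threshold-based switch between the two classifiers can be illustrated with a minimal sketch. The abstract does not give the thesis's weighting formula, so everything specific here is an assumption made for illustration: the function names, the parameter `alpha` (a stand-in for the adjustable factor, where 0 is plain majority-vote KNN and 1 is full inverse-class-frequency weighting), the already-trained linear SVM given by `w` and `b`, and the toy data. With a unit-norm `w`, the absolute decision value equals the distance to the decision boundary (the separating interface).

```python
import math
from collections import Counter


def weighted_knn_predict(x, train_X, train_y, k=3, alpha=0.5):
    """Vote among the k nearest neighbours with class-frequency-aware weights.

    alpha is a hypothetical adjustable factor (not the thesis's formula):
    each neighbour votes with weight (1 / class_size) ** alpha, so smaller
    alpha attenuates the extra weight granted to minority classes.
    """
    class_counts = Counter(train_y)
    # Keep the k training points closest to x (Euclidean distance).
    neighbours = sorted(
        zip(train_X, train_y), key=lambda pair: math.dist(x, pair[0])
    )[:k]
    votes = Counter()
    for _, label in neighbours:
        votes[label] += (1.0 / class_counts[label]) ** alpha
    return max(votes, key=votes.get)


def svm_decision_value(x, w, b):
    """Decision value of an already-trained linear SVM: w . x + b."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def hybrid_predict(x, w, b, train_X, train_y, threshold=1.0, k=3, alpha=0.5):
    """Use the SVM far from the boundary and weighted KNN close to it."""
    score = svm_decision_value(x, w, b)
    if abs(score) >= threshold:
        # Far from the decision boundary: trust the sign of the SVM output.
        return 1 if score > 0 else -1
    # Near the decision boundary: defer to the weighted KNN vote.
    return weighted_knn_predict(x, train_X, train_y, k=k, alpha=alpha)


# Toy unbalanced data: six samples of class -1, two samples of class +1.
train_X = [(-3.0, 0.0), (-2.5, 0.0), (-2.0, 0.0), (-1.5, 0.0),
           (-0.4, 0.0), (-0.3, 0.0), (0.2, 0.0), (0.3, 0.1)]
train_y = [-1, -1, -1, -1, -1, -1, 1, 1]
w, b = (1.0, 0.0), 0.0  # assumed pre-trained linear SVM with unit-norm w

far_label = hybrid_predict((2.0, 0.0), w, b, train_X, train_y)
near_label = hybrid_predict((-0.1, 0.0), w, b, train_X, train_y, alpha=1.0)
```

In this toy setup, the query at (-0.1, 0.0) has two majority-class neighbours and one minority-class neighbour among its three nearest points. Lowering `alpha` shifts its label from the minority class back to the majority class, which is the kind of attenuation the abstract describes.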
Keywords/Search Tags:Machine learning, classification algorithm, support vector machine, KNN algorithm improvement, harmonic factor, SVM-KNN hybrid algorithm