
Research And Application Of Multi-view Learning

Posted on: 2022-09-11    Degree: Master    Type: Thesis
Country: China    Candidate: Y R Liu    Full Text: PDF
GTID: 2518306602966069    Subject: Applied Mathematics
Abstract/Summary:
Sample classification is one of the important research topics in data mining. Multi-view learning has attracted increasing attention because it can fully exploit the structural characteristics of data from different views and thereby achieve better classification results. Moreover, in real applications the class sizes are often unequal: the number of samples in one class may be larger, or far larger, than in the others. This imbalance inevitably causes the classifier to be dominated by the majority-class samples, degrading classification performance on the minority class. This thesis therefore focuses on two problems: improving the training speed of the model, and learning from imbalanced data.

The two-stage learning method SVM-2K is a multi-view learning algorithm that uses the non-smooth hinge loss, but non-smooth models are relatively complex to solve. The least squares support vector machine (LSSVM), which uses the smooth least squares loss, is widely applied in scientific research because of its simple computation, high speed, and high accuracy. To improve training speed, this thesis introduces the least squares idea into SVM-2K. First, the LSSVM-2K model is proposed by replacing the hinge loss and the insensitive loss in SVM-2K with the least squares loss, which replaces the quadratic programming of the classical multi-view learning model with the solution of a system of linear equations. At the same time, two further least squares models, LSSVM-2KI and LSSVM-2KII, are proposed to explore the effect of the least squares substitution on the SVM-2K model. The experimental results show that LSSVM-2KI has the advantage in classification accuracy, LSSVM-2K performs well in both classification accuracy and computation speed, and LSSVM-2KII lies between the two in classification effect and training time.

Although the LSSVM-2K algorithm handles class-balanced multi-view data quickly and well, it may obtain only a sub-optimal solution on imbalanced data sets. Under the cost-sensitive learning principle, this thesis assigns fuzzy memberships to the training samples and proposes the fuzzy-weighted FLSSVM-2K method, which reduces the influence of outliers/noise on the model and improves classification accuracy on imbalanced data. However, the fuzzy-weighted FLSSVM-2K method still cannot solve the class-imbalance problem itself. Therefore, this thesis proposes an improved method based on class imbalance learning (CIL), FLSSVMCIL-2K: by assigning larger weights to minority-class samples, it increases the model's emphasis on the minority class, and thus addresses class imbalance and outliers/noise at the same time.

The experiments use the Gm value (geometric mean) instead of accuracy to evaluate model quality, and compare three membership determination methods (class center, estimated hyperplane, and actual hyperplane) and two decay functions (linear and exponential). The experimental results show that the proposed FLSSVM-2K is a very effective classification method for samples containing outliers/noise, and that FLSSVMCIL-2K achieves good classification results on multi-view imbalanced data sets with outliers/noise. In addition, the quality of the classification results depends on the choice of membership degree: each data set has its own structure, and good results are obtained only when an appropriate membership is selected. Finally, multi-view learning can achieve a better classification effect by assigning different membership degrees to the data samples of the two views.
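To illustrate the core computational claim above — that the least squares loss turns the usual quadratic program into a single linear system — the following sketch implements the classical single-view LSSVM classifier (Suykens' formulation). It is not the thesis's two-view LSSVM-2K model; the RBF kernel, the toy data, and all parameter values are illustrative assumptions.

```python
import numpy as np

def rbf_kernel(X1, X2, sigma=1.0):
    # Gram matrix of the Gaussian (RBF) kernel.
    d2 = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma ** 2))

def lssvm_train(X, y, gamma=10.0, sigma=1.0):
    """Single-view LSSVM classifier: the smooth least squares loss
    reduces training to solving one linear (KKT) system instead of a QP."""
    n = len(y)
    Omega = (y[:, None] * y[None, :]) * rbf_kernel(X, X, sigma)
    # KKT system:  [[0, y^T], [y, Omega + I/gamma]] [b; alpha] = [0; 1]
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = y
    A[1:, 0] = y
    A[1:, 1:] = Omega + np.eye(n) / gamma
    rhs = np.concatenate(([0.0], np.ones(n)))
    sol = np.linalg.solve(A, rhs)
    return sol[0], sol[1:]  # bias b, dual coefficients alpha

def lssvm_predict(X_train, y, alpha, b, X_test, sigma=1.0):
    K = rbf_kernel(X_test, X_train, sigma)
    return np.sign(K @ (alpha * y) + b)

# Toy usage: two well-separated Gaussian blobs (hypothetical data).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2, 1, (20, 2)), rng.normal(2, 1, (20, 2))])
y = np.concatenate([-np.ones(20), np.ones(20)])
b, alpha = lssvm_train(X, y)
pred = lssvm_predict(X, y, alpha, b, X)
```

Because `np.linalg.solve` runs in one shot, this is the speed advantage the abstract attributes to the least squares substitution: no iterative QP solver is needed.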
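The imbalance machinery described above can also be sketched concretely: the Gm value (geometric mean of per-class accuracies), a class-center fuzzy membership with a linear decay function, and cost-sensitive CIL class weights. The function names and the exact weighting scheme below are illustrative assumptions, not the thesis's definitions.

```python
import numpy as np

def g_mean(y_true, y_pred):
    # Geometric mean of sensitivity and specificity: unlike plain
    # accuracy, a trivial majority-class predictor scores 0 here.
    sens = np.mean(y_pred[y_true == 1] == 1)
    spec = np.mean(y_pred[y_true == -1] == -1)
    return np.sqrt(sens * spec)

def fuzzy_membership(X, y, delta=1e-6):
    """Class-center membership with linear decay: samples far from their
    class center (likely outliers/noise) receive smaller weights."""
    s = np.empty(len(y))
    for c in (-1, 1):
        idx = np.where(y == c)[0]
        center = X[idx].mean(axis=0)
        d = np.linalg.norm(X[idx] - center, axis=1)
        s[idx] = 1.0 - d / (d.max() + delta)  # linear decay into [0, 1]
    return s

def cil_weights(y):
    # Cost-sensitive class weights: the minority class gets a larger
    # weight, scaled so both classes contribute equally in total.
    n_pos, n_neg = np.sum(y == 1), np.sum(y == -1)
    return np.where(y == 1, len(y) / (2 * n_pos), len(y) / (2 * n_neg))
```

A combined per-sample weight in the spirit of FLSSVMCIL-2K would multiply the fuzzy membership by the class weight, so a clean minority-class sample is emphasized while a noisy one is still down-weighted. An exponential decay variant would replace the linear formula with `np.exp(-beta * d)` for some rate `beta`.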
Keywords/Search Tags:SVM-2K, Least Squares Loss, Hinge Loss, Multi-view Learning, Imbalanced Dataset