
Research On Algorithms Of Dictionary And Auxiliary Information Oriented To Support Vector Machines

Posted on: 2022-12-31
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Z Y Che
GTID: 1488306779982639
Subject: Computer Science and Technology

Abstract/Summary:
The support vector machine (SVM) is a machine learning method based on statistical learning theory, first proposed by Vapnik and other scholars in 1995 for binary classification problems. It is designed mainly to prevent a model from over-learning and under-learning and to avoid local minima and related issues. As research has deepened, a variety of SVM-based learning methods have been proposed, such as the fuzzy support vector machine, least squares support vector machine, robust support vector machine, k-nearest neighbor support vector machine, and convolutional neural network support vector machine, and SVMs have been widely applied in fields such as image classification, text classification, and pattern recognition. Although SVMs have achieved fruitful results on binary classification problems, researchers have continued to extend them, leading to the twin support vector machine (TWSVM). Similar to the standard SVM, the TWSVM constructs two non-parallel classification hyperplanes; it transforms one large quadratic programming problem into two smaller quadratic programming problems and places the data samples of the opposite class into the constraints of each problem, which greatly speeds up model training.

The 21st century is a brand-new era, an era of information technology. Data such as texts, images, audio, and news can be collected from the Internet anytime and anywhere, or extracted with a variety of feature extractors. In machine learning, some collected data carry additional features, such as annotations on a text or image; these additional features are called privileged information. For labeled data samples, the class they belong to is known from the label information. At the junction of the positive and negative samples, however, there is a third type of data sample that shares information with both the positive and the negative class; such samples are called Universum data. Both types of data can provide prior knowledge to assist in training a model. Making full use of these two types of data to build a unified model and assist in training classifiers raises three important challenges: (1) how to introduce privileged information into the twin support vector machine model so that this empirical knowledge assists the training of the classifier; (2) how to introduce a dictionary learning model into the twin support vector machine and use dictionary learning to remove noise and redundancy from the input samples, so that the transformed samples are sparser; (3) how to embed Universum data into a dictionary-based transfer learning model, building one dictionary in the source domain and one in the target domain, to ensure that the prior knowledge learned in the source domain is transferred to the target domain. To exploit these two types of samples effectively and address these problems, this thesis conducts in-depth research on SVM-oriented dictionary learning, privileged information, and Universum data. The main research work of this thesis is as follows.

1. To effectively utilize the additional information attached to data samples and the relationships between samples, this thesis proposes a new method named twin support vector machines with privileged information (TWSVM-PI). The method first uses a correcting function in a correcting space to embed the privileged information into the twin support vector machine, constructing a unified model. A penalty weight and a slack variable are introduced for the privileged information, so that both the latent privileged information and the original features of each sample are used to train the classifier; that is, privileged information with different weights adjusts the two non-parallel hyperplanes, which ensure that samples of the positive class and samples of the negative class are separated. In the optimization process, one larger quadratic programming problem is transformed into two smaller quadratic programming problems (QPPs), whose solutions give the two non-parallel hyperplanes. Finally, extensive experiments evaluating TWSVM-PI show that the proposed method achieves better performance than state-of-the-art methods.

2. To address noise, redundancy, and class imbalance in collected data, as well as samples that do not follow the same distribution, this thesis proposes a model combining twin support vector machines and dictionary learning, called twin support vector machines with dictionary learning (TSVMDL). The method first combines the twin support vector machine and dictionary learning into a unified classification model, training a discriminative TWSVM-based classifier through dictionary learning. A dictionary specific to the binary classification task is constructed, and an analysis dictionary is learned to ensure that the data samples can be bridged with their sparse codes. In addition, the l2,1-norm sparsity constraint is used in the dictionary learning model instead of the l0- or l1-norm constraints, so that the coefficient matrix is as row-sparse as possible. Finally, extensive experiments on different benchmark datasets verify the feasibility of the model.

3. Most current research ignores the unlabeled data samples located near the junction of the positive and negative classes, namely Universum data. This thesis proposes a dictionary-based transfer learning method with Universum data, namely U-DTL. The method embeds the ε-insensitive loss function into an SVM-based transfer learning model and uses the prior knowledge provided by the Universum data to assist in training the classification model. It then constructs a dictionary for the source domain and one for the target domain, using dictionary learning to enhance the sparsity of the original data and reduce the impact of noise on the transferred data. Furthermore, similarity constraints between the dictionaries of the two domains are introduced into the unified model to ensure that the prior knowledge learned in the source domain is transferred to the target domain, assisting classification there. Finally, extensive experiments on multiple benchmark datasets validate the effectiveness of the method.
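The twin-hyperplane construction the abstract describes can be made concrete with a small sketch. The code below is not the thesis's TWSVM-PI model; it is a simplified least-squares twin SVM in the style of Kumar and Gopal, where the two QPPs reduce to two linear systems. All function names and the toy dataset are illustrative assumptions.

```python
import numpy as np

def lstsvm_fit(A, B, c1=1.0, c2=1.0, reg=1e-6):
    """Fit a least-squares twin SVM (a simplified TWSVM variant).

    A: samples of the positive class, B: samples of the negative class.
    Each hyperplane is kept close to its own class and at distance
    roughly 1 from the opposite class, whose samples appear only in
    the constraint term (as in the twin SVM formulation).
    Returns (w1, b1) and (w2, b2), the two non-parallel hyperplanes.
    """
    H = np.hstack([A, np.ones((A.shape[0], 1))])  # augmented [A e]
    G = np.hstack([B, np.ones((B.shape[0], 1))])  # augmented [B e]
    I = reg * np.eye(H.shape[1])                  # small ridge for stability
    # Plane 1: close to class A, pushed away from class B.
    z1 = -np.linalg.solve(H.T @ H / c1 + G.T @ G + I,
                          G.T @ np.ones(B.shape[0]))
    # Plane 2: close to class B, pushed away from class A.
    z2 = np.linalg.solve(G.T @ G / c2 + H.T @ H + I,
                         H.T @ np.ones(A.shape[0]))
    return (z1[:-1], z1[-1]), (z2[:-1], z2[-1])

def lstsvm_predict(X, plane1, plane2):
    """Assign each row of X to the class whose hyperplane is nearer."""
    (w1, b1), (w2, b2) = plane1, plane2
    d1 = np.abs(X @ w1 + b1) / np.linalg.norm(w1)
    d2 = np.abs(X @ w2 + b2) / np.linalg.norm(w2)
    return np.where(d1 <= d2, 1, -1)  # +1 = class A, -1 = class B

# Toy example: two well-separated Gaussian blobs.
rng = np.random.default_rng(0)
A = rng.normal([2.0, 2.0], 0.5, size=(50, 2))    # positive class
B = rng.normal([-2.0, -2.0], 0.5, size=(50, 2))  # negative class
p1, p2 = lstsvm_fit(A, B)
acc = np.mean(np.r_[lstsvm_predict(A, p1, p2) == 1,
                    lstsvm_predict(B, p1, p2) == -1])
```

Replacing one large QP over all samples with two small solves, each constrained only by the opposite class, is exactly the structural trick that gives twin SVMs their training-speed advantage.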
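The row-sparsity claim in contribution 2 rests on a standard property of the l2,1 norm: its proximal operator shrinks whole rows of the coefficient matrix to zero, whereas the l1 norm zeros individual entries. The sketch below (not code from the thesis) illustrates this with group soft-thresholding; the threshold `tau` is an illustrative parameter.

```python
import numpy as np

def l21_norm(Z):
    """l2,1 norm of a matrix: the sum of the l2 norms of its rows."""
    return np.sum(np.linalg.norm(Z, axis=1))

def prox_l21(Z, tau):
    """Proximal operator of tau * ||.||_{2,1} (group soft-thresholding).

    Each row is shrunk toward zero by tau; any row whose l2 norm is
    at most tau vanishes entirely. This all-or-nothing behavior per
    row is what makes l2,1-regularized codes row-sparse.
    """
    norms = np.linalg.norm(Z, axis=1, keepdims=True)
    scale = np.maximum(1.0 - tau / np.maximum(norms, 1e-12), 0.0)
    return Z * scale

Z = np.array([[3.0, 4.0],    # row norm 5.0 -> kept, shrunk
              [0.3, 0.4],    # row norm 0.5 -> zeroed out entirely
              [0.0, 2.0]])   # row norm 2.0 -> kept, shrunk
S = prox_l21(Z, tau=1.0)
# S[1] is the all-zero row: the whole weak row is discarded at once.
```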
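Contribution 3 mentions the ε-insensitive loss. For reference, the standard form used with SVMs is shown below; this is the generic loss, not the U-DTL objective itself, and the value of `eps` is an illustrative choice.

```python
import numpy as np

def eps_insensitive(r, eps=0.1):
    """Standard epsilon-insensitive loss: L(r) = max(0, |r| - eps).

    Residuals inside the [-eps, +eps] tube cost nothing, so samples
    (e.g. Universum points) that already sit near the decision region
    contribute no penalty and only guide, rather than dominate, training.
    """
    return np.maximum(np.abs(r) - eps, 0.0)

r = np.array([-0.25, -0.05, 0.0, 0.08, 0.3])
loss = eps_insensitive(r, eps=0.1)
# Only the two residuals outside the tube (-0.25 and 0.3) are penalized.
```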
Keywords/Search Tags:support vector machine, twin support vector machines, dictionary learning, Universum data, transfer learning