Research On Classification Method Based On SVDD

Posted on:2021-05-23

Degree:Master

Type:Thesis

Country:China

Candidate:C Yang

Full Text:PDF

GTID:2428330626455334

Subject:Computer application technology

Abstract/Summary:

Classification is a central task in machine learning. In real-world classification tasks, however, data from different classes may overlap, producing non-separable regions whose samples are difficult to classify correctly. Machine learning mainly involves training a model on known data and then using the model to make predictions on unknown data. Probabilistic machine learning provides a probabilistic framework for this uncertainty, representing and controlling the uncertainty of models and predictions. Research on classification under uncertainty is therefore a meaningful topic. In addition, some samples are easy to collect in practice, while others are hard to collect because of the particular nature of their fields. As a result, some classes in the target dataset have many samples while others have few; that is, the class distribution is imbalanced. Traditional machine learning classification algorithms tend to favor the majority class on such problems, which degrades classification of the minority class. In machine fault diagnosis, medical diagnosis, and similar problems, the minority samples are few but very important, and misclassifying them may have serious consequences. Improving classification performance on the minority class in imbalanced data is therefore important. To address these problems, this thesis studies classification methods based on support vector data description (SVDD). The main contents cover the following two aspects:

(1) To address the uncertainty in classification tasks, which both current probabilistic machine learning methods and the traditional support vector data description method have difficulty handling, this thesis proposes a probability-based support vector data description method. First, the traditional support vector data description method is trained on each of the two classes separately to obtain a data description for each class, and the distance from each test sample to the center of each description is calculated. Then, a function that converts distance into probability is constructed, yielding the probability-based support vector data description method. Bagging is also used to build an ensemble, which further improves the performance of the data description. Experiments show that the proposed algorithm achieves better accuracy and F1 score, and that the performance of the data description is improved.

(2) To address the common imbalance problem in two-class data, a support vector data description method based on optimization is proposed at the algorithm level. First, several common optimization algorithms are introduced. Then, a support vector data description method for solving the imbalanced data classification problem is introduced. The count and distribution information of the samples are combined to redefine the C value, and several optimization algorithms are compared. Finally, experiments are carried out on five UCI datasets. The experimental results show that the proposed algorithm has certain advantages when combined with an optimization algorithm, among which GA has the better overall effect.

In summary, this thesis uses the support vector data description method to study two problems in machine learning classification tasks and verifies the proposed methods on experimental datasets. The research provides new ideas and methods for machine learning classification tasks and has theoretical and application value in the field of machine learning.
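The first contribution (one description per class, distance to each center, distance converted to probability) can be illustrated with a minimal sketch. The abstract does not give the thesis's actual conversion function or kernel, so everything below is an assumption: `fit_ball` stands in for SVDD by approximating the minimum enclosing ball (the hard-margin, linear-kernel special case) with the Badoiu-Clarkson iteration, and `class_probability` is a hypothetical softmax over radius-normalized distances. Bagging is omitted.

```python
import math

def dist(a, b):
    """Euclidean distance between two points given as sequences of floats."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def fit_ball(points, iters=200):
    """Approximate the minimum enclosing ball of `points`.

    Stand-in for SVDD (hard-margin, linear kernel only): the center is
    moved toward the current farthest point with a shrinking step
    (Badoiu-Clarkson iteration). Returns (center, radius).
    """
    center = list(points[0])
    for k in range(1, iters + 1):
        far = max(points, key=lambda p: dist(p, center))
        center = [c + (f - c) / (k + 1) for c, f in zip(center, far)]
    radius = max(dist(p, center) for p in points)
    return center, radius

def class_probability(x, ball_a, ball_b):
    """Hypothetical distance-to-probability map (not the thesis's function).

    Each distance is divided by the radius of its own description, and a
    softmax over the negated values gives the probability of class A.
    Assumes both radii are non-zero.
    """
    (center_a, radius_a), (center_b, radius_b) = ball_a, ball_b
    score_a = math.exp(-dist(x, center_a) / radius_a)
    score_b = math.exp(-dist(x, center_b) / radius_b)
    return score_a / (score_a + score_b)

# Toy two-class data: one description is fitted per class.
class_a = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
class_b = [(3.0, 3.0), (4.0, 3.0), (3.0, 4.0), (4.0, 4.0)]
ball_a = fit_ball(class_a)
ball_b = fit_ball(class_b)

p_near_a = class_probability((0.5, 0.5), ball_a, ball_b)
p_near_b = class_probability((3.5, 3.5), ball_a, ball_b)
print(p_near_a, p_near_b)
```

A point inside the class-A description gets a probability above 0.5 and a point inside the class-B description gets one below 0.5; points in an overlapping region would land near 0.5, which is the kind of uncertainty the probability output is meant to expose. A kernelized SVDD with a soft margin would replace `fit_ball` in any realistic setting.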

Keywords/Search Tags:

Support vector data description, Probabilistic machine learning, Imbalanced data, Ensemble, Classification

Related items

1	Research On Ensemble Method Of Structured Support Vector Machine For Imbalanced Data
2	Research On Ensemble Learning
3	Research On Imbalanced Data Classification Methods Based On Ensemble Learning
4	The Research Of Imbalanced Data Classification Algorithm Based On Support Vector Machine
5	Research On Several Problems In Support Vector Machine And Support Vector Domain Description
6	Hyperspectral Image Classification Based On Integrated Learning
7	Research On Classification Algorithms For Imbalanced Dataset
8	A Study On Algorithm For Classification Based On Support Vector Data Description
9	Application Research Of Used-car Recommendation Based On Classification Method On Imbalanced Data Sets
10	Methods Of Multiclass Support Vector Data Description Based On Extreme Learning Machine