Researches On Multi-Classification SVM Algorithm Based On Data Relationship

Posted on:2014-02-28

Degree:Master

Type:Thesis

Country:China

Candidate:Z Liang

Full Text:PDF

GTID:2268330401962382

Subject:Computer application technology

Abstract/Summary:

Classifying large-scale multi-class data efficiently and with high precision is an important problem in data mining, and extracting the relationships among the data to be classified is the key to solving it. Extracting data relationships means finding the implicit relationships (attributes, characteristics, boundaries, etc.) in the data to be classified, so that data of different categories can be separated by training a classifier (or a group of classifiers); the quality of the extracted relationships affects the classification results. In real-world applications and scientific research, the multi-classification problem has attracted increasing attention, and many efficient multi-classification algorithms have been proposed. Multi-classification methods based on the support vector machine (SVM) combine the SVM with a multi-classification strategy, optimize the extraction of relationships between data samples, and train a combination of classifiers. The multi-classification SVM algorithm based on data relationships partitions the various samples and removes redundant information in order to optimize the classifier group and improve classification accuracy. The main contributions are as follows:

(1) Existing multi-classification SVM methods are analyzed and summarized, their main problems are pointed out, and research is carried out to address these problems.

(2) The imbalanced data classification problem is summarized, the strengths and weaknesses of existing imbalanced data classification methods are pointed out, and improvement strategies are given for these deficiencies.

(3) An SVM multi-classification method for balanced multi-class data based on the vector product, DR-SVM, is proposed, and a preliminary discussion and study of the new method is made. The new method preprocesses the data using the vector inner product, discards redundant information, labels the SVM training samples effectively, and optimizes the classifier group model to improve classification efficiency.

(4) An SVM method for multi-class imbalanced data based on space spread, SS-SVM, is proposed. SS-SVM uses a spatial-extension approach to process the data to be classified, increasing the number of minority-class samples and reducing the degree of class imbalance, with the aim of improving classification efficiency for minority-class samples.

(5) For problems such as the "small-block" problem, the "redundant classification" problem, and the "absolute imbalance" problem, the thesis presents improvement and optimization methods based on DR-SVM and SS-SVM. It draws on a series of strategy principles and applies them to specific category division problems.

Multi-classification is a very popular research direction in the data mining field, and imbalanced classification is an important research focus in the era of large-scale data. This thesis not only enriches the theory and application of support vector machines but also broadens the range of approaches to the imbalanced and multi-classification problems, and it has important theoretical significance and practical value.
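
The idea of enlarging minority classes before training, as described for SS-SVM, can be illustrated with a minimal sketch. The abstract does not specify how the spatial extension is performed, so the code below substitutes a simple SMOTE-like interpolation between same-class samples; it should not be read as the SS-SVM procedure itself. The function names `oversample_by_interpolation` and `imbalance_ratio` and the toy data are hypothetical and chosen only for illustration.

```python
import random
from collections import defaultdict


def imbalance_ratio(y):
    """Return (size of largest class) / (size of smallest class)."""
    counts = defaultdict(int)
    for label in y:
        counts[label] += 1
    return max(counts.values()) / min(counts.values())


def oversample_by_interpolation(X, y, seed=0):
    """Grow every smaller class to the size of the largest class.

    ASSUMPTION: each synthetic point is placed on the segment between two
    randomly chosen samples of the same class. This is a stand-in for the
    "spatial extension" step, which the abstract does not define.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for features, label in zip(X, y):
        by_class[label].append(list(features))
    target = max(len(samples) for samples in by_class.values())

    # Start from a copy of the original data and append synthetic points.
    X_out, y_out = [list(f) for f in X], list(y)
    for label, samples in by_class.items():
        for _ in range(target - len(samples)):
            a, b = rng.choice(samples), rng.choice(samples)
            t = rng.random()  # interpolation weight in [0, 1)
            X_out.append([ai + t * (bi - ai) for ai, bi in zip(a, b)])
            y_out.append(label)
    return X_out, y_out


# Toy three-class data set: class "a" has 4 samples, "b" has 2, "c" has 1.
X = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1], [0.0, 0.3],
     [5.0, 5.0], [5.2, 4.8], [9.0, 0.0]]
y = ["a", "a", "a", "a", "b", "b", "c"]

X_bal, y_bal = oversample_by_interpolation(X, y)
print(imbalance_ratio(y), imbalance_ratio(y_bal))
```

The balanced set could then be passed to any multi-class SVM trainer. Note that a class with a single sample can only be duplicated under this interpolation scheme, which is one reason a more careful expansion strategy may be needed in practice.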

Keywords/Search Tags:

Data relationship, Space expansion, Imbalanced data, Multi-classification

Related items

1	Research And Application Of Imbalanced Data Classification
2	The Research Of Imbalanced Data Classification
3	The Algorithm Research Of Associative Classification And Classification Based On Imbalanced Data
4	Research Of Multi-class Imbalanced Data Classification Method
5	Research On The Classification Algorithm Of Imbalanced Data Sets
6	Research On Methods For Imbalanced Data Classification
7	Research On Classification Method For Imbalanced Data Sets And Its Application
8	Research On Application Of Classification Algorithms For Imbalanced Data
9	Research And Application Of Imbalanced Data Classification Algorithm
10	Research On Classification Methods For Large-scale Imbalanced Data