Research On Hierarchical Multi-class Classification Algorithm Based On OvO Decomposition

Posted on: 2019-05-11
Degree: Master
Type: Thesis
Country: China
Candidate: J J Ren
Full Text: PDF
GTID: 2428330545981748
Subject: Computer Science and Technology
Abstract/Summary:
Classification underlies the solution of many data-mining problems in areas such as financial instruments and clinical diagnosis, and is an indispensable part of research in the field. Although many real-world problems involve multiple classes, binary classification algorithms are simpler to design and therefore more feasible to implement. Binary classification algorithms are accordingly adopted to solve many multi-class problems in both scientific research and production. This approach not only broadens the range of application of binary algorithms but also offers more options when choosing an algorithm for a specific case, and it has therefore received wide attention. Because one-vs-one (OvO) is the most frequently adopted decomposition method and the hierarchical structure is the most frequently adopted aggregation method, the paper presents a comprehensive comparative analysis of the three major hierarchical aggregation methods built on OvO decomposition: Decision Directed Acyclic Graphs (DDAG), Directed Binary Trees (DBT), and Adaptive Directed Acyclic Graphs (ADAG). The paper also proposes an improved DBT classification algorithm and a new node measurement standard. The main contents are as follows:

(1) The paper compares the above models in terms of structural characteristics, classification accuracy, and running time, and arrives at the following findings. DBT has a flexible structure and high classification accuracy, especially when the number of classes is relatively small; when the number of classes is large, however, its training time is relatively long. DDAG has the shortest running time but the lowest accuracy, owing to its strict node-arrangement mechanism. ADAG is intermediate in both classification accuracy and running time, but its model structure is less controllable. These findings are verified by experiments on UCI standard data sets.

(2) During DBT model construction, choosing the wrong binary classifiers lowers the classification accuracy of the model. The paper proposes an improved DBT algorithm based on imbalance factors. It considers the influence of both sample balance and class margin on binary-classifier performance and uses the imbalance factors to weight the two, so as to cope with unevenly distributed samples. The result is a better measurement standard for binary classifiers. Experiments show that using the imbalance-factor measurement standard yields a DBT model with a lower error rate and better classification performance.

(3) To address the poor classification results caused by an inappropriate standard for choosing binary classifiers in hierarchical multi-class classification algorithms, the paper proposes a new method of measuring binary classifiers based on a Hellinger distance matrix. The Hellinger distance is easy to calculate and insensitive to uneven data distribution, so it can be used to measure the overlap between classes. Under the Hellinger distance matrix, classes become more distinguishable as their overlap decreases, so a hierarchical model with better classification performance can be constructed. The experimental results verify the effectiveness and feasibility of the standard.
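
To make the idea of a pairwise Hellinger distance matrix concrete, here is a minimal Python sketch. It is not the thesis's implementation: the abstract does not say how class distributions are estimated, so this sketch assumes a single numeric feature and a normalized histogram per class. The function names (`histogram`, `hellinger`, `hellinger_matrix`) and the toy data are hypothetical. Only the distance formula itself is standard: for discrete distributions p and q, H(p, q) = sqrt(sum_i (sqrt(p_i) - sqrt(q_i))^2 / 2), which ranges from 0 (identical) to 1 (no overlap).

```python
import math
from itertools import combinations


def histogram(values, edges):
    """Normalized histogram of `values` over bins given by sorted `edges`.

    Assumption for this sketch: one numeric feature, fixed bin edges.
    Values outside the edges are ignored.
    """
    counts = [0] * (len(edges) - 1)
    for v in values:
        for i in range(len(counts)):
            is_last = i == len(counts) - 1
            # Bins are half-open, except the last one, which includes its right edge.
            if edges[i] <= v < edges[i + 1] or (is_last and v == edges[-1]):
                counts[i] += 1
                break
    total = sum(counts)
    return [c / total for c in counts] if total else counts


def hellinger(p, q):
    """Hellinger distance between two discrete distributions of equal length."""
    squared = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))
    return math.sqrt(squared / 2)


def hellinger_matrix(samples_by_class, edges):
    """Distance for every unordered class pair, i.e. every OvO sub-problem."""
    hists = {label: histogram(vals, edges) for label, vals in samples_by_class.items()}
    return {
        (a, b): hellinger(hists[a], hists[b])
        for a, b in combinations(sorted(hists), 2)
    }


# Toy data (hypothetical): class sizes differ on purpose.
samples = {
    "A": [0.1, 0.2, 0.3, 0.4, 0.6, 0.8],
    "B": [0.5, 1.5],
    "C": [2.5, 2.7, 2.9],
}
dist = hellinger_matrix(samples, edges=[0, 1, 2, 3])
for pair, d in dist.items():
    print(pair, round(d, 4))
```

Because each histogram is normalized before comparison, the distance depends on the shape of each class's distribution and not on how many samples the class has. This is one plausible reading of the abstract's claim that the measure is insensitive to uneven data distribution. In a hierarchical model, pairs with a larger distance overlap less and would be natural candidates for earlier, more reliable splits.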
Keywords/Search Tags: Multi-class Classification, OvO Decomposition, Directed Binary Tree, Decision Directed Acyclic Graphs, Adaptive Directed Acyclic Graphs