
Research On Rotation Forest Algorithm For Imbalanced Data Classification Problem

Posted on: 2022-02-15
Degree: Master
Type: Thesis
Country: China
Candidate: E H Zhou
Full Text: PDF
GTID: 2518306557977319
Subject: Computer technology
Abstract/Summary:
To solve real-life class-imbalanced classification problems, especially the problem of low classification accuracy for minority classes in two-class problems, scholars have proposed various solutions, such as over-sampling, under-sampling, data synthesis, cost-sensitive learning, and ensemble learning. Among these, ensemble learning has received increasing attention due to its good classification performance and strong generalization ability. The rotation forest, proposed by Rodriguez, is a classifier ensemble algorithm based on feature transformation and a rotation strategy. It uses decision trees as base classifiers and increases the diversity among the base classifiers by constructing new training sets, thereby obtaining a better ensemble effect. However, when facing imbalanced classification problems, its performance is still not as good as expected, although it outperforms traditional machine learning algorithms such as the decision tree and the random forest. To further improve the algorithm's classification performance on imbalanced problems, this paper proposes improved algorithms and compares them with mainstream algorithms, while also extending research on the rotation forest. The main research contents of this paper are as follows.

Firstly, the commonly used methods for addressing the imbalance problem at home and abroad are introduced, mainly from the data and algorithm perspectives, and their advantages and disadvantages as well as research trends and value are discussed. The relevant knowledge and theories are then introduced to prepare for the improvement of the algorithm.

Secondly, to improve the classification accuracy for minority classes under imbalance, the rotation forest is improved: the Hyper-Safe-Level-SMOTE method is applied to each new training subset of the rotation forest, yielding the rotation balanced forest (ROBF). Experiments show that ROBF improves both the overall classification accuracy and the minority-class classification accuracy compared with the original and traditional algorithms.

Thirdly, a new combined algorithm, the cost-sensitive rotation balanced forest (CROBF), is proposed by combining the rotation balanced forest with cost-sensitive learning, which unites data-level pre-processing with algorithm-level improvement. After explaining the principle of the algorithm, experimental comparisons with several combination algorithms show that the new algorithm is effective.

Finally, the paper improves the deep forest and proposes a deep rotation balanced forest with strong edge scanning (SES-DROBF). Comparative analysis against traditional machine learning and improved deep forest algorithms shows that the ensemble algorithm is more competitive on high-dimensional imbalanced data.
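As an illustrative sketch only (not the thesis implementation), the Python code below shows the general idea behind a rotation-balanced-forest-style ensemble: each base decision tree is trained on a rotated view of the data built from per-feature-subset PCA, and the rotated training set is rebalanced by oversampling before fitting. Hyper-Safe-Level-SMOTE is not available as a library routine, so the standard SMOTE from imbalanced-learn stands in for it; the class name and all parameters here are hypothetical.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.tree import DecisionTreeClassifier
from imblearn.over_sampling import SMOTE  # stand-in for Hyper-Safe-Level-SMOTE


class RotationBalancedForestSketch:
    """Simplified rotation-forest-style ensemble with per-tree oversampling."""

    def __init__(self, n_estimators=10, n_feature_subsets=3, random_state=0):
        self.n_estimators = n_estimators
        self.n_feature_subsets = n_feature_subsets
        self.random_state = random_state

    def _rotation_matrix(self, X, rng):
        # Split the features into disjoint subsets, run PCA on a bootstrap
        # sample for each subset, and assemble the PCA loadings into one
        # block-structured rotation matrix.
        n_features = X.shape[1]
        order = rng.permutation(n_features)
        rotation = np.zeros((n_features, n_features))
        for subset in np.array_split(order, self.n_feature_subsets):
            rows = rng.choice(X.shape[0], size=X.shape[0], replace=True)
            pca = PCA().fit(X[rows][:, subset])
            rotation[np.ix_(subset, subset)] = pca.components_.T
        return rotation

    def fit(self, X, y):
        rng = np.random.RandomState(self.random_state)
        self.rotations_, self.trees_ = [], []
        for i in range(self.n_estimators):
            R = self._rotation_matrix(X, rng)
            X_rot = X @ R
            # Rebalance the rotated training set before fitting the base tree
            # (the minority class must have more samples than SMOTE's k_neighbors).
            X_bal, y_bal = SMOTE(random_state=self.random_state + i).fit_resample(X_rot, y)
            self.rotations_.append(R)
            self.trees_.append(
                DecisionTreeClassifier(random_state=self.random_state + i).fit(X_bal, y_bal)
            )
        return self

    def predict(self, X):
        # Majority vote over the differently rotated views
        # (assumes non-negative integer class labels, e.g. 0/1).
        votes = np.stack([t.predict(X @ R) for t, R in zip(self.trees_, self.rotations_)])
        return np.apply_along_axis(lambda c: np.bincount(c.astype(int)).argmax(), 0, votes)
```

A full implementation following the abstract would replace the stand-in SMOTE with Hyper-Safe-Level-SMOTE and, for the CROBF variant, add cost-sensitive learning, for example through misclassification-cost weighting of the base trees.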
Keywords/Search Tags: machine learning, imbalanced data classification, rotation forest, cost-sensitive learning, SMOTE