With the advancement of technology, data collection and processing have become increasingly effortless across industries. Swiftly sifting out the information latent in these data can not only significantly enhance the technical capability of intelligent data processing in various industries, but also provide substantial support for the development of related industries. As data volumes grow, data distributions increasingly tend toward imbalance, and the classes with few samples are often precisely the ones of greatest research interest. In medical diagnosis, spam filtering, bank-card fraud prevention, and similar applications, imbalanced distributions are common; handling them effectively makes it possible to discover and predict potential risks in time, which is of significant scientific and practical value. Existing classifiers, however, often fail to achieve satisfactory recognition rates on imbalanced datasets: traditional classification models are usually trained on balanced datasets in order to obtain high classification accuracy, and their performance degrades when the data are imbalanced.

To address this, this paper first proposes a hybrid sampling method based on boundary information fusion clustering, derived from an analysis of the distribution patterns of imbalanced datasets. The method defines the concept of a boundary point and preserves the boundary points of the majority-class samples accordingly. The remaining majority-class samples are then under-sampled, and the Borderline-SMOTE method is applied to over-sample the under-sampled dataset, yielding the final training set. Combining this sampling method with traditional classifiers in experiments on multiple public datasets shows that it effectively improves the classification accuracy on imbalanced data.

Second, drawing on the ideas of the reward function and cumulative reward in reinforcement learning, an improved decision tree algorithm integrating a reinforcement learning mechanism is proposed. Because minority-class samples are easily misclassified in imbalanced data classification, the criterion for selecting the splitting attribute at each node of the decision tree is adjusted so that the algorithm pays more attention to minority-class samples during splitting, raising the probability that minority-class samples are classified correctly. Comparison experiments against the original decision tree algorithm show that the proposed algorithm improves recall and the G-mean metric.

Finally, the improved decision tree algorithm is used as the base classifier of the AdaBoost algorithm and combined with the proposed hybrid sampling method to obtain the final algorithm. The proposed algorithm is tested on several public datasets, and the experimental results verify its effectiveness.
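The three-step hybrid sampling procedure described above can be sketched as follows. This is a minimal illustration, not the implementation evaluated in the paper: the k-nearest-neighbour boundary-point test, the `keep_ratio` under-sampling rate, and the simple interpolation step standing in for the full Borderline-SMOTE algorithm are all assumptions made for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

def knn_indices(X, k):
    # indices of each sample's k nearest neighbours (self excluded)
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    return np.argsort(d, axis=1)[:, 1:k + 1]

def boundary_mask(X, y, k=5):
    # a sample is a boundary point if any of its k nearest
    # neighbours belongs to a different class
    nn = knn_indices(X, k)
    return np.array([(y[nn[i]] != y[i]).any() for i in range(len(X))])

def hybrid_sample(X, y, k=5, keep_ratio=0.5):
    maj, mino = (0, 1) if (y == 0).sum() >= (y == 1).sum() else (1, 0)
    b = boundary_mask(X, y, k)

    # step 1: preserve all majority-class boundary points
    keep = (y == maj) & b
    # step 2: under-sample the remaining (interior) majority points
    interior = np.flatnonzero((y == maj) & ~b)
    keep[rng.choice(interior, size=int(len(interior) * keep_ratio),
                    replace=False)] = True
    keep |= (y == mino)  # minority samples are always kept

    Xs, ys = X[keep], y[keep]

    # step 3: Borderline-SMOTE-style over-sampling -- interpolate new
    # minority samples between "danger" (boundary) minority points and
    # other minority points until the two classes are balanced
    need = int((ys == maj).sum() - (ys == mino).sum())
    danger = np.flatnonzero((y == mino) & b)
    if danger.size == 0:
        danger = np.flatnonzero(y == mino)
    mino_all = np.flatnonzero(y == mino)
    synth = [X[i] + rng.random() * (X[j] - X[i])
             for i, j in zip(rng.choice(danger, size=max(need, 0)),
                             rng.choice(mino_all, size=max(need, 0)))]
    if synth:
        Xs = np.vstack([Xs, np.asarray(synth)])
        ys = np.concatenate([ys, np.full(len(synth), mino)])
    return Xs, ys
```

On a toy imbalanced dataset the function returns a class-balanced sample: majority boundary points survive intact, interior majority points are thinned, and synthetic minority points are generated near the class boundary, which is where misclassification is most likely.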
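The reward-adjusted splitting criterion can likewise be sketched in miniature. The idea is that each class contributes to the node impurity in proportion to an assigned reward rather than its raw count, so a larger reward for the minority class pulls splits toward isolating minority samples. The per-class reward values and the exhaustive single-feature threshold search below are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def reward_gini(y, rewards):
    # Gini impurity with reward-weighted class proportions: a class's
    # share is rewards[c] * count(c), normalised over the node
    classes = np.unique(y)
    w = np.array([rewards[c] * (y == c).sum() for c in classes], float)
    p = w / w.sum()
    return 1.0 - (p ** 2).sum()

def best_split(X, y, rewards):
    # exhaustive search over per-feature thresholds, minimising the
    # size-weighted reward-Gini of the two children
    best = (None, None, np.inf)
    for f in range(X.shape[1]):
        for t in np.unique(X[:, f])[:-1]:
            left = X[:, f] <= t
            score = (left.mean() * reward_gini(y[left], rewards)
                     + (~left).mean() * reward_gini(y[~left], rewards))
            if score < best[2]:
                best = (f, t, score)
    return best
```

With equal rewards this reduces to the classical Gini criterion; raising the minority class's reward inflates its apparent proportion inside mixed nodes, so such nodes look more impure and the search prefers splits that separate minority samples cleanly, mirroring the recall and G-mean gains the abstract reports.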