Multi-label Classification Via Classifier Chain And Gradient Boosting

Posted on:2021-05-11
Degree:Master
Type:Thesis
Country:China
Candidate:Y Chen
GTID:2428330614458407
Subject:Computer Science and Technology
Abstract/Summary:
In recent years, multi-label learning has received wide attention in fields such as text classification, image recognition, and bioinformatics. At present, the main concerns of multi-label learning are label correlation and complexity, and the classifier chain method takes both into account. The classifier chain is a high-order correlation algorithm that is flexible and has linear complexity. However, it is sensitive to the order of the labels in the chain (the chain order problem), which makes its predictions unstable. Gradient boosting is a traditional single-label ensemble learning method that automatically focuses on misclassified samples and improves classification accuracy. Building on this, this thesis applies the idea of gradient boosting to the classifier chain method, focusing on the chain order problem and improving classification accuracy while keeping the simplicity and flexibility of the classifier chain algorithm. In addition, an improved GBCC algorithm based on local label correlation is proposed in combination with a label dimensionality reduction algorithm. The main research work of this thesis is as follows:

1. To address the insufficient prediction performance and the chain order problem of the classifier chain algorithm, this thesis proposes a gradient boosting method suited to multi-label learning. First, a new hierarchical algorithm is built from gradient boosting and the traditional multi-label classifier chain algorithm. It improves classification accuracy and stability by establishing high-order label correlation, and it addresses the chain order problem through feature transfer and label transfer between successive layers. Second, a pre-pruning strategy based on label confidence and an evaluation score simplifies the model and helps prevent overfitting.

2. To further improve the efficiency of the algorithm, an improved classifier chain algorithm based on local label correlation is proposed, combining balanced clustering of the label space with a feature selection method based on the local label information gain ratio. First, balanced clustering is performed on the label set, which divides one large task into multiple small tasks, speeds up training, and makes the algorithm easier to parallelize. Then, on each local label set, feature selection based on the local label information gain ratio is applied to the feature space, which increases the heterogeneity of the ensemble and further accelerates model training.
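The plain classifier chain that the abstract takes as its starting point can be sketched as follows. This is a minimal illustration only, not the GBCC algorithm proposed in the thesis, whose details the abstract does not give. The class names `OneNearestNeighbor` and `ClassifierChain`, the 1-nearest-neighbour base learner, and the toy data are all assumptions made for the example. Each link in the chain is trained on the original features plus the labels that come earlier in the chain, so the `order` argument determines which label dependencies a link can see.

```python
from typing import List, Sequence


class OneNearestNeighbor:
    """Tiny stand-in base classifier (an assumption; any binary learner would do)."""

    def fit(self, X: Sequence[Sequence[float]], y: Sequence[int]) -> "OneNearestNeighbor":
        self.X = [list(row) for row in X]
        self.y = list(y)
        return self

    def predict_one(self, x: Sequence[float]) -> int:
        # Squared Euclidean distance from x to every stored training row.
        dists = [sum((a - b) ** 2 for a, b in zip(row, x)) for row in self.X]
        return self.y[dists.index(min(dists))]


class ClassifierChain:
    """Plain classifier chain: one binary model per label, linked in a fixed order."""

    def __init__(self, order: Sequence[int]):
        # `order` must be a permutation of the label indices 0..n_labels-1.
        self.order = list(order)
        self.models: List[OneNearestNeighbor] = []

    def fit(self, X: Sequence[Sequence[float]], Y: Sequence[Sequence[int]]) -> "ClassifierChain":
        self.models = []
        augmented = [list(row) for row in X]
        for label in self.order:
            target = [row[label] for row in Y]
            self.models.append(OneNearestNeighbor().fit(augmented, target))
            # The true label becomes an extra input feature for the next link.
            augmented = [feats + [float(t)] for feats, t in zip(augmented, target)]
        return self

    def predict(self, X: Sequence[Sequence[float]]) -> List[List[int]]:
        results = []
        for x in X:
            feats = list(x)
            pred = [0] * len(self.order)
            for label, model in zip(self.order, self.models):
                value = model.predict_one(feats)
                pred[label] = value
                # At prediction time the predicted label is passed down the chain.
                feats = feats + [float(value)]
            results.append(pred)
        return results


# Toy data (hypothetical): label 2 is the XOR of labels 0 and 1.
X_train = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
Y_train = [[0, 0, 0], [0, 1, 1], [1, 0, 1], [1, 1, 0]]
X_new = [[0.1, 0.9], [0.9, 0.2]]

forward_chain = ClassifierChain(order=[0, 1, 2]).fit(X_train, Y_train)
reverse_chain = ClassifierChain(order=[2, 1, 0]).fit(X_train, Y_train)

forward_pred = forward_chain.predict(X_new)
reverse_pred = reverse_chain.predict(X_new)
print(forward_pred, reverse_pred)
```

Because an early mistake is passed down to every later link, the result can depend on `order` when a weaker base learner or noisier data is used. That dependence is the chain order problem described in the abstract.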
Keywords/Search Tags:machine learning, multi-label learning, classifier chain, label correlation, gradient boosting