An Improved Multi-Label Classifier Chain Algorithm Via Label Space Correlation

Posted on:2020-12-08

Degree:Master

Type:Thesis

Country:China

Candidate:Y Yang

GTID:2428330590471693

Subject:Computer Science and Technology

Abstract/Summary:

Traditional Multi-Label Classifier Chain algorithms have several limitations: the initial label chain order is random, classification performance is unstable, and they cannot deal effectively with large-scale multi-label data sets. This thesis proposes an improved Multi-Label Classifier Chain algorithm based on Label Space Correlation (LSCC). Combining the advantages of label space dimensionality reduction and LSCC, it also proposes a Label Space Dimension Reduction Algorithm via LSCC (LSDRCC). The main contents of this thesis are as follows:

1. Multi-Label Classifier Chain algorithms assume that the label at position k is associated only with the first k-1 labels. In fact, randomly initialized label chains do not satisfy that assumption. In this thesis, LSCC is proposed for feature selection and label chain order optimization on large-scale multi-label data sets. First, a distance formula is defined and the label space is partitioned by clustering. Predictions are then obtained by constructing several optimized local classifier chains in parallel, built from an approximate optimal solution. Experiments on 12 multi-label data sets from 5 different domains, using 3 different types of base classifier, show that LSCC performs better than existing algorithms in both classification accuracy and running time.

2. A feature selection method based on local label cluster mutual information is proposed to improve the adaptability of chain-based multi-label algorithms to large-scale multi-label data sets. The relative mutual information between local labels and features is used to filter out the local feature subset of each label cluster.

3. LSDRCC is proposed. It optimizes label space dimension reduction in three stages: Label Coding, Model Training, and Hidden Label Decoding. It reduces the running time of classification tasks and improves the adaptability of the improved classifier chain algorithm to large-scale data sets. In addition, the algorithm is implemented on the Spark parallel computing framework, which makes full use of the advantages of in-memory computing.
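
The per-cluster feature selection idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the thesis's "relative mutual information" formula, so the code below assumes plain mutual information summed over the labels of a cluster. The function names, the toy data, and the two single-label clusters are all hypothetical and chosen only for illustration.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Mutual information (in bits) between two equal-length discrete sequences."""
    n = len(xs)
    px = Counter(xs)
    py = Counter(ys)
    pxy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in pxy.items():
        # p(x, y) * log2(p(x, y) / (p(x) * p(y))), written with raw counts
        mi += (c / n) * log2(c * n / (px[x] * py[y]))
    return mi


def select_features_for_cluster(features, labels, cluster, k):
    """Indices of the k features with the highest summed MI against a label cluster.

    Assumption: plain MI is used as the score; the thesis's own
    "relative mutual information" is not specified in the abstract.
    """
    scores = []
    for j, column in enumerate(features):
        score = sum(mutual_information(column, labels[l]) for l in cluster)
        scores.append((score, j))
    # Highest score first; ties are broken by the lower feature index.
    scores.sort(key=lambda s: (-s[0], s[1]))
    return sorted(j for _, j in scores[:k])


# Hypothetical toy data, stored column-wise: one list per feature / per label.
features = [
    [0, 0, 1, 1],  # feature 0 mirrors label 0
    [0, 1, 0, 1],  # feature 1 mirrors label 1
    [1, 1, 1, 1],  # feature 2 is constant and carries no information
]
labels = [
    [0, 0, 1, 1],  # label 0
    [0, 1, 0, 1],  # label 1
]
clusters = [[0], [1]]  # hypothetical label clusters (in the thesis these come from clustering)

local_subsets = [select_features_for_cluster(features, labels, c, k=1) for c in clusters]
```

Each label cluster ends up with its own feature subset, so a local classifier chain trained for that cluster only needs to see the features that are informative for its labels.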

Keywords/Search Tags:

multi-label classification, classifier chain, label clustering, feature selection, Spark

Related items

1	Research On Multi-label Feature Selection And Classifier Chains Algorithms
2	Research On Multi-label Classification Algorithm Based On Label Relationship
3	Research On The Multi-label Feature Selection And Classification Methods With The Label Correlations
4	Study Of The Classification Method Of Imbalanced Multi-Label Data Based On Label Correlation
5	Research On Multi-label Classification Based On Classifier Chains
6	Based On Decision Relevance Multi-label Classification And Feature Selection Algorithm
7	Feature Selection Method Research For Multi-label Classification
8	Research On Algorithm Of Feature Selection With Fuzzy Discernibility Matrix For Multi-label Classification
9	Parallel Multi-label Classifier Chains Algorithm Using Apache Spark
10	Research On Multi-label Classification Algorithm With Label Correlations