Concept Drift Detection Algorithm Based On Multi-label Learning With Label Special Features

Posted on:2021-04-21

Degree:Master

Type:Thesis

Country:China

Candidate:H K Liu

Full Text:PDF

GTID:2428330626965143

Subject:Computer Science and Technology

Abstract/Summary:

In recent years, with the rapid development of network technology, large volumes of data have been generated in the form of data streams, and data stream research has attracted growing attention. Under the traditional machine learning classification framework, each instance is assigned a single label. Real-world data, however, often belong to several categories at once, and these categories together form the label set of an instance; if one label is omitted, the information about the instance is incomplete. Multi-label learning emerged to handle this ambiguity in real data. In practical applications, the large number of redundant features in real data increases the computational complexity of multi-label learning and degrades classification performance. An effective remedy is to perform feature extraction on multi-label data to eliminate redundant features. Among such approaches, multi-label learning algorithms based on label-specific features carry out label-specific feature selection and classification by exploiting the correlation between labels, but they pay little attention to the correlation between instances. Moreover, because most real-world data arrive continuously as data streams, research on multi-label data streams is also receiving increasing attention. To address these problems, this thesis makes the following contributions:

1. To address the lack of consideration of instance correlation in existing multi-label learning algorithms, a classification algorithm is proposed that learns label-specific features together with instance correlation. The model takes into account not only the correlation between labels but also the correlation between instance features: a similarity graph is constructed to learn the similarity structure of the instance feature space, and this instance similarity information is incorporated into model training. Experimental results show that the proposed algorithm extracts label-specific features more effectively and achieves better classification performance.

2. Existing concept drift detection methods mostly focus on single-label data streams and are ill-suited to detecting concept drift in multi-label data streams. To address this, a hierarchical-checking concept drift detection algorithm for multi-label data streams is proposed. The algorithm comprises two checking layers: the first judges whether concept drift may have occurred by detecting changes in the data distribution, and the second verifies whether drift has really occurred by measuring the degree of change in the label confusion matrix. Experiments are carried out on 14 data sets, including real and synthetic multi-label data sets. Compared with existing methods, the proposed hierarchical verification algorithm performs better in terms of subset accuracy, Jaccard similarity, and F-measure. The experimental results show that the proposed algorithm detects concept drift effectively and improves classification performance.
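
The two-layer check described in the second contribution can be illustrated with a minimal Python sketch. The abstract does not say which distribution statistic, confusion-matrix distance, or thresholds the thesis uses, so everything specific here is an assumption made for illustration: the per-feature standardized mean shift in the first layer, the summed absolute difference of per-label 2x2 confusion matrices in the second, both threshold values, and all function names.

```python
import numpy as np


def distribution_shift(ref_X, new_X):
    """Layer 1 statistic (assumed): largest standardized shift of a feature mean."""
    ref_X = np.asarray(ref_X, dtype=float)
    new_X = np.asarray(new_X, dtype=float)
    ref_std = ref_X.std(axis=0) + 1e-12  # small constant avoids division by zero
    std_err = ref_std / np.sqrt(len(new_X))
    z = np.abs(new_X.mean(axis=0) - ref_X.mean(axis=0)) / std_err
    return float(z.max())


def label_confusion(Y_true, Y_pred):
    """Per-label 2x2 confusion matrices as window fractions, shape (n_labels, 2, 2)."""
    Y_true = np.asarray(Y_true, dtype=bool)
    Y_pred = np.asarray(Y_pred, dtype=bool)
    tn = (~Y_true & ~Y_pred).mean(axis=0)
    fp = (~Y_true & Y_pred).mean(axis=0)
    fn = (Y_true & ~Y_pred).mean(axis=0)
    tp = (Y_true & Y_pred).mean(axis=0)
    # Row 0 = true negative class, row 1 = true positive class.
    return np.stack(
        [np.stack([tn, fp], axis=-1), np.stack([fn, tp], axis=-1)], axis=-2
    )


def confusion_change(ref_cm, new_cm):
    """Layer 2 statistic (assumed): largest per-label summed absolute difference."""
    return float(np.abs(new_cm - ref_cm).sum(axis=(1, 2)).max())


def hierarchical_check(ref, new, shift_threshold=3.0, confusion_threshold=0.2):
    """Report drift only if both layers agree.

    ref and new are (X, Y_true, Y_pred) tuples for a reference window and a
    recent window. Both thresholds are illustrative values, not from the thesis.
    """
    ref_X, ref_Y, ref_P = ref
    new_X, new_Y, new_P = new
    # Layer 1: has the feature distribution changed at all?
    if distribution_shift(ref_X, new_X) <= shift_threshold:
        return False
    # Layer 2: did the per-label confusion matrices change enough to confirm it?
    change = confusion_change(
        label_confusion(ref_Y, ref_P), label_confusion(new_Y, new_P)
    )
    return change > confusion_threshold
```

In this sketch the second layer runs only when the first one fires, so a shift in the feature distribution that leaves the per-label confusion matrices unchanged is not reported as drift.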

Keywords/Search Tags:

Multi-label, Label special features, Concept drift, Data stream

Related items

1	Research On Multi-label Data Stream Classification Method Based On Kernel Extreme Learning Machine
2	Research On Multi-label Data Stream Semi-supervised Integrated Classification Method Based On Cooperative Training
3	Research On Classification Of Multi-Label Data Streams
4	Research On Class Incremental Learning And Concept Drift Detection In Multi-label Data Streams Classification
5	The Research And Implementation Of User Attribute Streaming Prediction Based On Multi-label Learning
6	Research On Multi-label Data Streaming Classification Algorithm On Very Fast Decision Tree
7	Research On Several Issues Of Multi-Label Feature Representation
8	Multi-label Classification Research Based On Label-specific Features And Label Correlation
9	Malicious DoH Traffic Detection For Concept Drift And Noise Label
10	Research On Multi-label Learning Algorithms Based On Samples And Label Correlations