
Research On Label Propagation Algorithm For Semi-supervised Classification

Posted on: 2022-02-19
Degree: Master
Type: Thesis
Country: China
Candidate: Y W Hou
Full Text: PDF
GTID: 2480306509970149
Subject: Computer Science and Technology
Abstract/Summary:
The rapid development of internet technology and the information industry has greatly improved data collection and acquisition, and the scale of acquired data has become unprecedentedly large. However, determining label information for every data object is very "expensive," so databases contain large amounts of unlabeled data. To handle such "partially labeled data," researchers introduced semi-supervised learning, which uses a small number of labeled samples together with a large number of unlabeled samples in the data set. Semi-supervised learning has become an important research direction in data mining, and label propagation is one of its important methods. Label propagation uses a graph to mine the latent distribution structure of the data and then spreads labels according to the relationships between samples that the graph describes. Since it was proposed, it has attracted wide attention from scholars. This thesis therefore systematically studies label propagation algorithms for semi-supervised classification. The main contributions are as follows:

(1) A clustering-based label ensemble propagation algorithm is proposed. The algorithm clusters the sample set multiple times. Within the clusters produced by each clustering, complementary entropy measures how mixed the sample labels in a cluster are, and label propagation is carried out only in clusters with low confusion. When the ratio of the number of times a sample has received a given label to the number of clusterings exceeds 0.5, the sample is assigned that label. Clustering and label propagation are repeated until all unlabeled samples have obtained labels. Experimental results demonstrate the effectiveness of the proposed algorithm.

(2) A positive and negative label propagation algorithm based on smoothing and regularization of anchor points and sample points is proposed. The algorithm smooths and regularizes anchor points and sample points simultaneously, and propagates positive and negative label information simultaneously, in order to classify the unlabeled samples. Experimental results demonstrate the effectiveness of the proposed algorithm.

This thesis analyzes the problems of existing graph-based label propagation algorithms and designs two label propagation algorithms. They effectively avoid the difficulty of graph construction and improve the classification accuracy of anchor-based label propagation. The work further enriches the study of label propagation algorithms and provides new methods and research ideas for semi-supervised classification problems.
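
The abstract does not give implementation details, so the following is only a minimal sketch of how the clustering-based ensemble in contribution (1) might look, under several assumptions that are not from the thesis: k-means as the clustering method, complementary entropy computed as sum_i p_i (1 - p_i) over the known labels in a cluster, in-cluster propagation reduced to assigning the cluster's majority label, and the threshold `max_entropy`. The function names and the convention of marking unlabeled samples with -1 are hypothetical.

```python
import numpy as np

def complementary_entropy(labels):
    """Assumed form: sum_i p_i * (1 - p_i) over the label frequencies p_i."""
    if len(labels) == 0:
        return 0.0
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return float(np.sum(p * (1.0 - p)))

def kmeans(X, k, rng, n_iter=20):
    """Plain k-means with random initial centers; returns cluster assignments."""
    centers = X[rng.choice(len(X), size=k, replace=False)]  # fancy index -> copy
    for _ in range(n_iter):
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        assign = dist.argmin(axis=1)
        for j in range(k):
            if np.any(assign == j):
                centers[j] = X[assign == j].mean(axis=0)
    return assign

def ensemble_propagate(X, y, k=2, n_clusterings=10, max_entropy=0.1,
                       max_rounds=20, seed=0):
    """Label unlabeled samples (y == -1) by voting over repeated clusterings."""
    rng = np.random.default_rng(seed)
    y = y.copy()
    for _ in range(max_rounds):
        if not np.any(y == -1):
            break  # every sample has a label
        classes = np.unique(y[y != -1])
        votes = np.zeros((len(X), len(classes)))
        for _ in range(n_clusterings):
            assign = kmeans(X, k, rng)
            for j in range(k):
                members = np.flatnonzero(assign == j)
                known = y[members][y[members] != -1]
                # Skip clusters with no labels or with too much label confusion.
                if len(known) == 0 or complementary_entropy(known) > max_entropy:
                    continue
                vals, counts = np.unique(known, return_counts=True)
                majority = vals[counts.argmax()]
                votes[members, np.searchsorted(classes, majority)] += 1
        ratio = votes / n_clusterings
        # Accept a label only if it won in more than half of the clusterings.
        accept = (y == -1) & (ratio.max(axis=1) > 0.5)
        if not accept.any():
            break  # no progress; stop rather than loop forever
        y[accept] = classes[ratio.argmax(axis=1)[accept]]
    return y

if __name__ == "__main__":
    # Two well-separated groups with one labeled sample each.
    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1],
                  [10, 10], [10, 11], [11, 10], [11, 11]], dtype=float)
    y = np.array([0, -1, -1, -1, 1, -1, -1, -1])
    print(ensemble_propagate(X, y))
```

The early exit when no sample passes the 0.5 vote ratio is a safeguard added for this sketch; the abstract only states that clustering and propagation repeat until all samples are labeled.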
Keywords/Search Tags:Semi-supervised learning, Clustering, Complementary entropy, Label propagation, Smooth regularization