Font Size: a A A

Safe And High Efficient Label Propagation Algorithm

Posted on:2021-03-16Degree:MasterType:Thesis
Country:ChinaCandidate:D M LiangFull Text:PDF
GTID:2428330647951050Subject:Computer Science and Technology
Abstract/Summary:PDF Full Text Request
Machine learning has been successful in many real-world tasks.Its success often depends on adequate labeled data.However,in real tasks,it is sometimes difficult to obtain labels for all data,e.g.,when the speed of data generation is much faster than the speed of data labeling.Machine learning frameworks that do not rely on fully labeled data,i.e.,semi-supervised learning,is much closer to real-world tasks,whereas its modeling is also more difficult.Label propagation is one of the mainstream paradigms of semi-supervised learn-ing.Based on the manifold assumption(data belonging to the same pattern have similar labels),the use of inherent graph structure information enables data labels to be effec-tively propagated to unlabeled data.Although much research has been done on label propagation,there are two major bottlenecks.The first is the computational efficiency of label propagation.Label propagation involves matrix operations on the graph struc-ture.The computation and storage overhead is high,which makes it difficult to adapt to large-scale graph data.The second is the performance guarantee of label propagation.Label propagation is sensitive to the selection of the graph structure.Improper selec-tion may cause instability or even serious degradation of performance,which makes it difficult to adapt to practical scenarios with high-reliability requirements.This thesis focuses on these two issues and has made the following progress.1.Aiming at the computational efficiency of label propagation,an efficient and scalable stochastic label propagation algorithm SLP is proposed.The algorithm decomposes the objective function of label propagation into the sum of loss of each poin so that it can utilize the advantages of existing efficient stochastic op-timization algorithms to solve the problem.Theory guarantees the convergence and approximation of SLP.Experimentally,SLP is more than 5 times faster than existing methods in graph structures with millions of points.It is worth mentioning that the method in this paper can flexibly adobt new developments in stochastic optimization algorithms to improve.2.Aiming at the performance guarantee of label propagation,a safe graph construc-tion algorithm Sa Graph is proposed.This algorithm uses only dense regions in the data distribution to construct the graph structure,which can effectively avoid risky unlabeled data and its graph structure.The algorithm uses the idea of ensemble learning and optimize the weights of multiple base graphs.Exper-imental results show that Sa Graph effectively improves the safeness of graph construction and can explain the impact of base graphs on the safe graph.Be-sides,a review on safe semi-supervised learning is completed.
Keywords/Search Tags:Machine Learning, Semi-Supervised Learning, Label Propagation, Computational Efficiency, Performance Guarantee
PDF Full Text Request
Related items