Research On Graph-based Semi-supervised Learning Algorithm Based On Binary Similarity Measure

Posted on:2022-03-23

Degree:Master

Type:Thesis

Country:China

Candidate:J Z Miao

GTID:2518306317493964

Subject:Computer application technology

Abstract/Summary:

An important task in data analysis is predicting the category of samples, which requires enough labeled data carrying category information to train the learner. However, labeling data demands substantial manpower and material resources, which greatly increases the cost of acquiring labeled data. Unlabeled data, by contrast, is relatively easy to acquire, and large amounts of it can be collected with simple information tools. Unfortunately, using only unlabeled data can lead to imprecise classification. Semi-supervised learning was therefore proposed to reduce the cost of data acquisition and improve classification accuracy by combining a large amount of unlabeled data with a small amount of labeled data.

Among the many semi-supervised learning methods, graph-based semi-supervised learning is a representative one. Because it can use rigorous mathematical language to transform the learning task into a convex optimization problem and then obtain the optimal solution, it has attracted wide attention from scholars in recent years, and many effective graph-based semi-supervised learning algorithms have been proposed. These methods divide the learning process into two steps: similarity measurement between samples and label propagation. Research on graph-based semi-supervised learning focuses mainly on two objectives: first, accurately measuring the similarity between samples to improve the accuracy of label propagation; second, effectively reducing the learning algorithm's demand for labeled data. With respect to these two objectives, most graph-based semi-supervised learning methods have four deficiencies: insufficient use of label information, a fixed form of distance measurement between samples, failure to use intermediate results, and no description of similarity between samples from the perspective of attribute columns. To address these four deficiencies, this thesis uses labels to improve the distance measurement between sample instances and constructs probabilistic dependence relationships between attributes to measure the similarity between different dimensions of the data space. On this basis, it studies graph-based semi-supervised learning algorithms based on binary similarity measurement. The specific contributions are as follows.

First, to address the first three deficiencies, this thesis proposes the Semi-Supervised Learning Algorithm of Graph Based on Label-Based Metric Learning, which makes full use of the small amount of label information in the data and of the intermediate results of the label propagation process to update and optimize the measurement between samples. Based on the local assumption in semi-supervised learning, the algorithm uses the Mahalanobis distance to measure the similarity of samples, so as to describe the relationship between samples more accurately. Meanwhile, information entropy is introduced into the label propagation process so that the algorithm uses the intermediate results of label propagation effectively, reducing the method's demand for initially labeled data. Experimental results on six real data sets show that the proposed algorithm achieves higher classification accuracy than three traditional graph-based semi-supervised learning algorithms in more than 95% of cases.

Second, to measure similarity between the attributes of the data space, this thesis proposes a Bayesian network generation algorithm based on the probabilistic relationships between attributes (BIC-based Node Order Learning for Improving Bayesian Network Structure Learning). The algorithm first finds, pair by pair, the nodes with the strongest dependence to form an undirected connected graph structure. It then orients the edges of this structure after V-structure identification to obtain a base Bayesian network, and derives from it a topological order of the nodes, which provides inaccessibility constraints for the subsequent learning of the network structure. The algorithm aims to provide probabilistic similarity information about each dimension of the data space for the subsequent label propagation process, so as to effectively improve learning performance. Simulation experiments on nine expert-built Bayesian networks of different scales verify that the algorithm identifies the probabilistic dependence relationships between attributes with high accuracy.

Third, a graph-based semi-supervised learning algorithm based on label-based measurement and the probabilistic relationships between attributes is proposed to measure binary similarity from both sample instances and attribute columns. The stability of each sample is first characterized by a cluster ensemble method. The two similarity measures, instance distance between samples and probabilistic relationship between attributes, are then fused by weighting. The algorithm obtains complete binary similarity information about the data, which provides more accurate similarity information between samples for graph-based semi-supervised learning and improves classification accuracy. Comparison with four semi-supervised learning algorithms on nine real data sets shows that the proposed algorithm enhances the classification performance of graph-based semi-supervised learning.
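
The two-step scheme described in the abstract (similarity measurement, then label propagation) can be illustrated with a minimal Python sketch. This is not the thesis's algorithm. It assumes a fixed, hand-picked positive-definite matrix `M` in place of a metric learned from labels, a Gaussian affinity with bandwidth `sigma`, and the standard closed-form propagation F = (I - alpha*S)^-1 Y on a symmetrically normalized graph. The function names, the toy data, and all parameter values are hypothetical. The entropy function only shows one way the confidence of intermediate propagation results could be quantified.

```python
import numpy as np

def mahalanobis_affinity(X, M, sigma=1.0):
    """Gaussian affinity from squared Mahalanobis distances (x-y)^T M (x-y)."""
    diff = X[:, None, :] - X[None, :, :]              # (n, n, d) pairwise differences
    d2 = np.einsum("ijk,kl,ijl->ij", diff, M, diff)   # (n, n) squared distances
    W = np.exp(-d2 / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)                          # no self-loops
    return W

def propagate_labels(W, Y, alpha=0.9):
    """Closed-form propagation F = (I - alpha*S)^-1 Y, S = D^-1/2 W D^-1/2."""
    d_inv_sqrt = 1.0 / np.sqrt(W.sum(axis=1))
    S = W * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.linalg.solve(np.eye(W.shape[0]) - alpha * S, Y)

def label_entropy(F, eps=1e-12):
    """Shannon entropy of each row of F after normalizing it to a distribution."""
    P = np.clip(F, eps, None)
    P = P / P.sum(axis=1, keepdims=True)
    return -(P * np.log(P)).sum(axis=1)

# Toy data (hypothetical): two well-separated 2-D clusters of 10 points each.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal([0.0, 0.0], 0.2, (10, 2)),
               rng.normal([3.0, 3.0], 0.2, (10, 2))])

# Assumption: M is a placeholder for a learned Mahalanobis metric.
M = np.array([[2.0, 0.5],
              [0.5, 1.0]])

# One labeled sample per class; all other rows of Y stay zero (unlabeled).
Y = np.zeros((20, 2))
Y[0, 0] = 1.0
Y[10, 1] = 1.0

W = mahalanobis_affinity(X, M)
F = propagate_labels(W, Y)
pred = F.argmax(axis=1)   # predicted class per sample
H = label_entropy(F)      # low entropy = confident propagated label
```

In a scheme like the one the abstract describes, low-entropy samples could be treated as reliable intermediate results and fed back to refine the metric; that feedback loop is not implemented here.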

Keywords/Search Tags:

machine learning, graph-based semi-supervised learning, metric learning, binary similarity, Bayesian networks

Related items

1	Semi-supervised Metric Learning Based Anchor Graph Hashing For Large Scale Image Retrieval
2	Research On Semi-Supervised Learning Algorithms Based On Bayesian Method
3	Research On The Application Of Geometric Information In The Semi-supervised Learning
4	Graph's Optimal Research In Graph-based Semi-supervised Learning
5	Research On Weakly Supervised Learning Based On Controlled Random Walk Model
6	Graph-based Semi-supervised Learning With Adaptive Similarity Estimation
7	Research And Application On Supervised Similarity Metric Learning Approaches
8	Research On Theory, Algorithms And Application Of Graph-based Semi-supervised Learning
9	Structure Semi-Supervised Learning And Its Application
10	Research On Bayesian Learning Theory And Its Application