Research On Optimization Of Semi-supervised Classification Algorithm Combining With Active Learning

Posted on: 2014-02-13 | Degree: Master | Type: Thesis
Country: China | Candidate: Y Wang | Full Text: PDF
GTID: 2248330395498872 | Subject: Software engineering
Abstract/Summary:
In many traditional approaches to machine learning, a target function is estimated using either labeled data or unlabeled data alone. However, in many real-world problems both kinds of data are available at the same time. Semi-supervised learning addresses this setting by combining unlabeled data with labeled data to improve classification, and it can achieve high classification accuracy with little human labeling effort. Graph-based semi-supervised learning has recently attracted increasing interest, and several novel approaches have been proposed.

This thesis first analyses the influence of unlabeled data on classification accuracy and concludes that unlabeled data is useful only if the model assumption matches the real data well. It then introduces the framework of graph-based semi-supervised learning using label propagation.

Insufficient labeled data has a critical influence on the performance of semi-supervised learning. Active learning is an effective way to reduce a semi-supervised learning algorithm's dependence on the amount of labeled data. During active learning, the classifier takes the initiative to select a set of the most informative instances, that is, points that maximize classifier performance, rather than naively selecting the point with maximum label ambiguity. The selected instances are labeled by a domain expert and then used as labeled data to retrain a new classifier.

On this basis, we present a framework that combines active learning with semi-supervised learning algorithms. Building on the Gaussian Random Field and Harmonic Function (GRF) and Local and Global Consistency (LGC) semi-supervised learning approaches, we develop two novel graph-based algorithms, AL-GRF and AL-LGC, that use active learning. Experiments on UCI datasets such as MNIST, LETTERS, and IRIS show better performance than conventional learning approaches.
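
The ideas summarized above can be illustrated with a minimal sketch. It is not the thesis's AL-GRF implementation, which the abstract does not specify. It assumes binary labels (0/1), a dense Gaussian (RBF) affinity graph, and the standard harmonic-function solution f_u = (D_uu - W_uu)^{-1} W_ul f_l. For the query step it assumes an expected-risk criterion commonly paired with GRF, evaluated by naively re-solving the system for each candidate. All function names (`rbf_affinity`, `harmonic_solution`, `estimated_risk`, `select_query`) and the `sigma` parameter are hypothetical.

```python
import numpy as np

def rbf_affinity(X, sigma=1.0):
    """Dense Gaussian (RBF) affinity matrix with a zero diagonal (assumed graph)."""
    sq_dist = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    W = np.exp(-sq_dist / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    return W

def harmonic_solution(W, labeled_idx, y_labeled):
    """Return (unlabeled indices, estimated P(y=1)) from the harmonic function."""
    n = W.shape[0]
    labeled_idx = np.asarray(labeled_idx, dtype=int)
    unlabeled_idx = np.setdiff1d(np.arange(n), labeled_idx)
    laplacian = np.diag(W.sum(axis=1)) - W            # L = D - W
    L_uu = laplacian[np.ix_(unlabeled_idx, unlabeled_idx)]
    W_ul = W[np.ix_(unlabeled_idx, labeled_idx)]
    f_l = np.asarray(y_labeled, dtype=float)
    f_u = np.linalg.solve(L_uu, W_ul @ f_l)           # solves L_uu f_u = W_ul f_l
    return unlabeled_idx, f_u

def estimated_risk(f_u):
    """Estimated error of thresholding f_u at 0.5: sum of min(f, 1 - f)."""
    return float(np.sum(np.minimum(f_u, 1.0 - f_u)))

def select_query(W, labeled_idx, y_labeled):
    """Pick the unlabeled node whose labeling minimizes the expected risk."""
    unlabeled_idx, f_u = harmonic_solution(W, labeled_idx, y_labeled)
    best_node, best_risk = None, np.inf
    for node, p1 in zip(unlabeled_idx, f_u):
        expected = 0.0
        # Average over both possible answers, weighted by the current belief.
        for label, prob in ((0, 1.0 - p1), (1, p1)):
            idx = list(labeled_idx) + [int(node)]
            ys = list(y_labeled) + [label]
            if len(idx) < W.shape[0]:
                _, f_new = harmonic_solution(W, idx, ys)
                expected += prob * estimated_risk(f_new)
            # If nothing remains unlabeled, the remaining risk is zero.
        if expected < best_risk:
            best_node, best_risk = int(node), expected
    return best_node
```

In an active learning loop under these assumptions, `select_query` would be called once per round, the returned node sent to the domain expert, and the answer appended to `labeled_idx` and `y_labeled` before solving again. Re-solving for every candidate is quadratic in the number of unlabeled points, so this form is only practical for small graphs.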
Keywords/Search Tags: Graph-based semi-supervised learning, Active learning, Gaussian random field and harmonic function, Local and global consistency