Tree Decomposition For Large Scale Semi-supervised Classification

Posted on: 2013-11-02  Degree: Master  Type: Thesis
Country: China  Candidate: H R Lv  Full Text: PDF
GTID: 2248330374975892  Subject: Computational Mathematics
Abstract/Summary:
With the rapid development of computer technology and the explosive growth of information on the Internet, problems such as text classification, web page classification, and image recognition face challenges in computation and memory requirements. At the same time, large scale semi-supervised classification, which trains on a large amount of unlabeled data together with a small amount of labeled data, has broad application prospects. How to handle large scale semi-supervised classification problems has therefore become a hot topic in the pattern recognition, machine learning, and data mining research communities.

Based on a study of the current state of the art in semi-supervised classification, this dissertation investigates large scale semi-supervised classification algorithms that organize samples with clustering feature (CF) trees and use a local learning strategy. It proposes an algorithm framework for large scale semi-supervised classification based on CF tree decomposition and local learning (CFTD-SSC). The framework first applies the CF tree to decompose the unlabeled samples into a series of subsets. Then, for each subset, CFTD-SSC predicts the labels of the samples with a semi-supervised classification method.

Moreover, within this framework, the dissertation designs several local semi-supervised classifiers: CFTD-S3VM and three local graph-based methods, CFTD-LGC, CFTD-GFHF, and CFTD-LGT.

Experimental results show that the CFTD-SSC framework is suitable for large scale semi-supervised classification. The kNN-adaptionNN graph used in CFTD-LGT is superior to the kNN graph, and the label propagation rule used by CFTD-LGC is better than the one adopted by CFTD-GFHF. Compared with state-of-the-art large scale semi-supervised classification methods such as PVM and AGR, CFTD-LGC offers shorter learning time while maintaining high classification accuracy.
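The decompose-then-classify-locally idea can be illustrated with a minimal sketch. It is not the dissertation's implementation, and all function names and parameters (`decompose`, `propagate`, `cftd_ssc_sketch`, `threshold`, `sigma`, `alpha`) are hypothetical. Three simplifying assumptions are made: a single level of clustering-feature entries (count and linear sum) stands in for a full CF tree, every subset is paired with all labeled samples, and the local classifier is an iterative label-spreading rule of the form F <- alpha*S*F + (1-alpha)*Y on a fully connected Gaussian graph rather than a kNN graph.

```python
import math

def decompose(points, threshold):
    """Single-level stand-in for CF-tree leaves (assumption, not a full CF tree).
    Each entry keeps a count and a linear sum; a point joins the nearest entry
    whose centroid lies within `threshold`, otherwise it starts a new entry."""
    entries = []
    for i, p in enumerate(points):
        best, best_d = None, threshold
        for e in entries:
            centroid = [v / e["n"] for v in e["ls"]]
            d = math.dist(p, centroid)
            if d <= best_d:
                best, best_d = e, d
        if best is None:
            entries.append({"n": 1, "ls": list(p), "idx": [i]})
        else:
            best["n"] += 1
            best["ls"] = [a + b for a, b in zip(best["ls"], p)]
            best["idx"].append(i)
    return [e["idx"] for e in entries]

def propagate(points, labels, n_classes, sigma=1.0, alpha=0.9, iters=50):
    """Local label spreading: iterate F <- alpha*S*F + (1-alpha)*Y, where S is
    the symmetrically normalized Gaussian affinity matrix. `labels` holds a
    class index for labeled points and None for unlabeled ones."""
    n = len(points)
    W = [[0.0 if i == j else
          math.exp(-math.dist(points[i], points[j]) ** 2 / (2 * sigma ** 2))
          for j in range(n)] for i in range(n)]
    deg = [sum(row) or 1.0 for row in W]  # guard against isolated points
    S = [[W[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
         for i in range(n)]
    Y = [[1.0 if labels[i] == c else 0.0 for c in range(n_classes)]
         for i in range(n)]
    F = [row[:] for row in Y]
    for _ in range(iters):
        F = [[alpha * sum(S[i][j] * F[j][c] for j in range(n))
              + (1 - alpha) * Y[i][c]
              for c in range(n_classes)] for i in range(n)]
    return [max(range(n_classes), key=lambda c: F[i][c]) for i in range(n)]

def cftd_ssc_sketch(labeled, labeled_y, unlabeled, n_classes, threshold):
    """Decompose the unlabeled samples, then label each subset locally."""
    preds = [None] * len(unlabeled)
    for idx in decompose(unlabeled, threshold):
        local_pts = list(labeled) + [unlabeled[i] for i in idx]
        local_lab = list(labeled_y) + [None] * len(idx)
        out = propagate(local_pts, local_lab, n_classes)
        for k, i in enumerate(idx):
            preds[i] = out[len(labeled) + k]
    return preds

# Toy usage: two labeled points, four unlabeled points in two groups.
labeled = [(0.0, 0.0), (5.0, 5.0)]
labeled_y = [0, 1]
unlabeled = [(0.1, 0.2), (0.3, -0.1), (5.1, 4.9), (4.8, 5.2)]
print(cftd_ssc_sketch(labeled, labeled_y, unlabeled, n_classes=2, threshold=1.0))
```

Because each local graph covers only one subset plus the labeled samples, the dense affinity matrix stays small even when the unlabeled set is large, which is the motivation the abstract gives for decomposing before classifying.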
Keywords/Search Tags: Large scale classification, Clustering feature tree, Semi-supervised support vector machine, Graph-based method