Research Of Semi-supervised Classification Methods

Posted on: 2018-03-12
Degree: Master
Type: Thesis
Country: China
Candidate: C Xi
GTID: 2348330518475155
Subject: digital media technology
Abstract/Summary:
In the age of big data, data sets keep growing and the cost of annotating them keeps rising, which highlights the benefit of using large numbers of unlabeled samples to assist labeled samples in classifier training. Semi-supervised learning, one of the methods that can exploit unlabeled samples, has attracted growing attention from researchers because, unlike active learning, it requires no human interaction during the learning process. It uses a large number of unlabeled samples to assist training on the labeled samples, and it combines supervised and unsupervised learning methods. In general, supervised methods extract the information in the labeled samples, and unsupervised methods explore the knowledge contained in the unlabeled data. Semi-supervised learning usually depends on a model assumption: the closer the assumption is to reality, the better a semi-supervised method performs. The clustering assumption and the manifold assumption are the two commonly used in semi-supervised learning, and a number of semi-supervised methods are derived from them. In classification, graph-based semi-supervised methods use the manifold assumption, and large-margin semi-supervised methods use the clustering assumption. Building on current algorithms with mature theory and on the latest research results, this thesis carries out the following work.

(1) We summarize the general characteristics of semi-supervised learning methods, which improve classifier performance by making full use of supervised information such as data labels and by learning effectively from unlabeled samples. Drawing on the Manifold Regularization (MR) framework, we propose a semi-supervised classification method based on joint regularization by manifold and pairwise constraints. On top of MR, we introduce a constraint term that makes effective use of the supervised information. This term converts the data labels into pairwise constraints, which extracts further information from the labeled data. The manifold regularization term preserves the local geometric structure of the samples, retaining the advantage of manifold learning methods in using unlabeled samples.

(2) We build on the recently proposed modified clustering assumption, which holds that similar individuals should have similar class memberships rather than crisp label assignments. Under maximum-entropy inference, we introduce a quadratic entropy, similar to the information entropy, and propose a semi-supervised classification method based on class membership and quadratic entropy. On the one hand, the new method inherits the ability of the modified clustering assumption to fuzzily partition data that cross class boundaries. On the other hand, introducing the maximum entropy principle overcomes the probability deviation problem of the original modified clustering assumption method and guarantees that the algorithm obtains unbiased probability estimates during optimization.
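The building blocks named in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's formulation, which the abstract does not give. The function names, the k-NN graph with binary weights, and the choice of quadratic entropy as 1 - sum(p^2) are all assumptions made for illustration. The sketch shows how labels might be turned into must-link and cannot-link pairs, how a graph Laplacian penalty measures smoothness over the local structure of the samples, and how a quadratic entropy behaves on class-membership vectors.

```python
import numpy as np

def labels_to_pairs(y):
    """Turn labels of the labeled samples into pairwise constraints.

    Returns (must_link, cannot_link), each a list of index pairs (i, j), i < j.
    Hypothetical helper: the thesis's exact constraint term is not specified.
    """
    must_link, cannot_link = [], []
    n = len(y)
    for i in range(n):
        for j in range(i + 1, n):
            (must_link if y[i] == y[j] else cannot_link).append((i, j))
    return must_link, cannot_link

def knn_laplacian(X, k=2):
    """Unnormalized Laplacian L = D - W of a symmetrized k-NN graph.

    Binary edge weights are an assumption; heat-kernel weights are also common.
    """
    X = np.asarray(X, dtype=float)
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(d2, np.inf)          # a point is not its own neighbor
    W = np.zeros(d2.shape)
    nearest = np.argsort(d2, axis=1)[:, :k]
    for i, nbrs in enumerate(nearest):
        W[i, nbrs] = 1.0
    W = np.maximum(W, W.T)                # make the graph undirected
    return np.diag(W.sum(axis=1)) - W

def manifold_penalty(f, L):
    """f^T L f, which equals 0.5 * sum_ij W_ij * (f_i - f_j)^2."""
    f = np.asarray(f, dtype=float)
    return float(f @ L @ f)

def quadratic_entropy(p):
    """Assumed form 1 - sum_c p_c^2 for a class-membership vector p."""
    p = np.asarray(p, dtype=float)
    return float(1.0 - np.sum(p ** 2))

if __name__ == "__main__":
    X = [[0.0], [1.0], [10.0], [11.0]]    # two well-separated groups
    L = knn_laplacian(X, k=1)
    print(labels_to_pairs([0, 0, 1]))
    print(manifold_penalty([1, 1, -1, -1], L))   # smooth over the graph
    print(manifold_penalty([1, -1, 1, -1], L))   # cuts both graph edges
    print(quadratic_entropy([0.5, 0.5]), quadratic_entropy([1.0, 0.0]))
```

In this toy setup, a prediction vector that is constant within each group incurs no Laplacian penalty, while one that alternates within groups is penalized. The assumed quadratic entropy is largest for uniform memberships and zero for crisp assignments, which parallels the behavior of Shannon entropy.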
Keywords/Search Tags:Semi-Supervised, Modeling Assumption, Pairwise Constraints, Modified Clustering Assumption, Maximum Entropy