Coordinate Descent Method For Semi-supervised Learning And Application To Document Classification

Posted on: 2011-05-13
Degree: Master
Type: Thesis
Country: China
Candidate: X Li
GTID: 2178360308464398
Subject: Computational Mathematics
Abstract/Summary:
In recent years, the rapid development of computers and networks has produced large amounts of data, and extracting useful information from these data has become a common need. However, obtaining labeled data is time-consuming and laborious.

Semi-Supervised Support Vector Machines (S3VM) apply the margin maximization principle to both labeled and unlabeled examples. The first widely used implementation was by Joachims. Since then, various non-convex optimization techniques have been proposed to solve S3VM, because its formulation leads to a non-convex optimization problem. We use entropy regularization instead, which turns the formulation into a convex optimization problem.

Coordinate descent is a common unconstrained optimization technique, but its use for S3VM has not been explored much. In this work, we propose a novel coordinate descent algorithm for training S3VM. At each step, the proposed method minimizes a one-variable sub-problem while keeping the other variables fixed. The sub-problem is solved by Newton steps with a line search. The procedure converges globally at a linear rate. Each sub-problem involves only the values of the corresponding feature, which is more convenient than accessing the data by instance. Experiments show that our method is more efficient than state-of-the-art methods.

Document classification has broad application prospects as the technical basis of information filtering, information retrieval, search engines, text databases and so on. Text data sets are large-scale and sparse. The algorithm takes advantage of this sparsity to achieve better performance than other semi-supervised classification algorithms. We use text data sets to verify the algorithm. Our method is more accurate than multinomial mixture models and graph-based methods, and more efficient than state-of-the-art S3VM methods.
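The coordinate descent scheme described in the abstract (one-variable sub-problems, Newton steps, line search, feature-wise data access) can be illustrated with a minimal sketch. The abstract does not give the thesis's objective function, so the sketch below assumes a stand-in: an L2-regularized logistic loss on labeled data only, without the entropy regularization term for unlabeled examples. The function names, the column-wise storage format, and the parameters `C`, `beta` and `sigma` are all hypothetical choices for illustration, not the author's implementation.

```python
import math


def log1pexp(a):
    """Numerically stable log(1 + exp(a))."""
    return a + math.log1p(math.exp(-a)) if a > 0 else math.log1p(math.exp(a))


def sigmoid(a):
    """Numerically stable logistic function."""
    if a >= 0:
        return 1.0 / (1.0 + math.exp(-a))
    e = math.exp(a)
    return e / (1.0 + e)


def objective(w, z, y, C):
    """Assumed stand-in objective: 0.5*||w||^2 + C * sum_i log(1 + exp(-y_i * z_i))."""
    return 0.5 * sum(v * v for v in w) + C * sum(
        log1pexp(-yi * zi) for yi, zi in zip(y, z)
    )


def coordinate_descent(columns, y, C=1.0, sweeps=50, beta=0.5, sigma=0.01):
    """Cyclic coordinate descent with one Newton step per coordinate.

    columns[j] lists the nonzeros of feature j as (instance index, value) pairs,
    so each sub-problem touches only the data of one feature.
    """
    w = [0.0] * len(columns)
    z = [0.0] * len(y)  # cached decision values z_i = w . x_i
    for _ in range(sweeps):
        for j, col in enumerate(columns):
            # First and second derivative of the objective along coordinate j.
            g, h = w[j], 1.0
            for i, x in col:
                p = sigmoid(y[i] * z[i])
                g += C * (p - 1.0) * y[i] * x
                h += C * p * (1.0 - p) * x * x
            d = -g / h  # Newton direction for the one-variable sub-problem
            if d == 0.0:
                continue

            def change(step):
                """Objective change if w[j] moves by `step` (uses column j only)."""
                delta = 0.5 * ((w[j] + step) ** 2 - w[j] ** 2)
                for i, x in col:
                    delta += C * (
                        log1pexp(-y[i] * (z[i] + step * x)) - log1pexp(-y[i] * z[i])
                    )
                return delta

            # Backtracking line search with a sufficient-decrease condition.
            t = 1.0
            while change(t * d) > sigma * t * g * d:
                t *= beta
                if t < 1e-10:  # give up on this coordinate for this sweep
                    t = 0.0
                    break
            step = t * d
            w[j] += step
            for i, x in col:
                z[i] += step * x
    return w, z
```

Storing the data by feature rather than by instance means that each update reads one sparse column and refreshes only the cached decision values of the instances in which that feature occurs, which is why sparse text data suits this kind of method.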
Keywords/Search Tags: Semi-supervised learning, S3VM, Entropy regularization, Coordinate descent algorithm, Document classification