Web Pages Classification Based On Active Learning Support Vector Machine Learning

Posted on: 2014-08-30
Degree: Master
Type: Thesis
Country: China
Candidate: Y Y Z Ou
Full Text: PDF
GTID: 2298330452962700
Subject: Computer Science and Technology
Abstract/Summary:
With the development of Internet technology and the explosive growth in the amount of information, how to classify these massive data automatically has become a research focus. Among the many automatic classification algorithms, the Support Vector Machine (SVM) has been adopted by the research departments of Internet companies for its excellent learning ability and high classification accuracy.

In this paper, we study the current state of research on support vector machines, their theoretical basis and training process, the idea of active learning, and the problem of multi-class classification. Traditional active learning algorithms select the initial sample points either at random or according to the prior probability; we propose a new selection criterion instead. First, we use the max-min distance method together with the K-means algorithm to obtain the cluster centers. Then we combine training-set reduction with the K-means algorithm to obtain the training samples. This method avoids selecting singular points. Because a support vector machine handles two-class classification, we combine it with the proposed method to build a multi-class active learning algorithm.

Finally, the improved active learning algorithm is applied to web page classification. The experimental results show that the improved algorithm achieves better recall and precision.
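
The initial-sample selection described above can be sketched roughly as follows. This is a minimal illustration, not the thesis's actual procedure, which the abstract does not spell out. It assumes that samples are numeric feature vectors compared by Euclidean distance, that max-min distance is used to seed K-means, and that the real sample nearest to each resulting cluster center is taken as an initial point to label. The function names (maxmin_seeds, kmeans, initial_sample_indices) are hypothetical, and the training-set reduction step is not shown.

```python
import math


def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def maxmin_seeds(points, k):
    """Max-min distance seeding (assumed variant): start from the first
    point, then repeatedly add the point whose distance to its nearest
    already-chosen seed is largest."""
    seeds = [points[0]]
    while len(seeds) < k:
        nxt = max(points, key=lambda p: min(dist(p, s) for s in seeds))
        seeds.append(nxt)
    return seeds


def kmeans(points, centers, iters=20):
    """Plain K-means refinement of the given initial centers."""
    for _ in range(iters):
        clusters = [[] for _ in centers]
        for p in points:
            j = min(range(len(centers)), key=lambda c: dist(p, centers[c]))
            clusters[j].append(p)
        # Mean of each cluster; an empty cluster keeps its old center.
        new_centers = [
            tuple(sum(col) / len(c) for col in zip(*c)) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return centers


def initial_sample_indices(points, k):
    """Index of the real sample closest to each cluster center. These
    would be the first points sent for labeling before SVM training."""
    centers = kmeans(points, maxmin_seeds(points, k))
    return [
        min(range(len(points)), key=lambda i: dist(points[i], c))
        for c in centers
    ]
```

Called on a list of feature-vector tuples with k set to the desired number of initial samples, initial_sample_indices returns one index per cluster. Choosing the sample nearest to a cluster mean, rather than a raw max-min seed, is one plausible way to keep isolated points out of the initial labeled set, since the seeds themselves tend to lie at the extremes of the data.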
Keywords/Search Tags: Support vector machine, Active learning, Reduction in the training set, Webpage classification, Multi-class classification