With the development of Internet technology and the explosive growth in the amount of information, how to classify massive data automatically has become a research focus. Among the many automatic classification algorithms, the Support Vector Machine (SVM) has been adopted by the research departments of Internet companies for its excellent learning ability and high classification accuracy.

In this paper, we review the current state of SVM research, its theoretical basis and training process, the idea of active learning, and the problem of multi-class classification. Traditional active learning algorithms select the initial sample points either at random or according to a priori probabilities; we propose a new selection criterion instead. First, we use the max-min distance method together with the K-means algorithm to obtain the cluster centers. Then, we combine training-set reduction with the K-means algorithm to obtain the training samples. This method avoids selecting singular points. Because an SVM handles two-class classification, we combine it with the proposed method to obtain a multi-class active learning algorithm.

Finally, the improved active learning algorithm is applied to web page classification. The experimental results show that the improved algorithm achieves better recall and precision.
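To make the initial-sample selection step concrete, the following is a minimal sketch under stated assumptions, not the implementation evaluated in this paper. It seeds K-means with the max-min distance rule and then takes, for each cluster center, the nearest actual sample as an initial point to label. The function names (`max_min_seeds`, `k_means`, `initial_samples`), the choice of the first point as the starting seed, the Euclidean metric, and the nearest-to-center rule are all illustrative assumptions; the training-set reduction step is not modelled here.

```python
import math


def max_min_seeds(points, k):
    """Return k seed indices chosen by the max-min distance rule.

    Assumption: the first point is used as the starting seed.
    """
    seeds = [0]
    # Distance from every point to its nearest already-chosen seed.
    nearest = [math.dist(p, points[0]) for p in points]
    while len(seeds) < k:
        # The next seed is the point farthest from all current seeds.
        idx = max(range(len(points)), key=lambda i: nearest[i])
        seeds.append(idx)
        nearest = [min(d, math.dist(p, points[idx]))
                   for d, p in zip(nearest, points)]
    return seeds


def k_means(points, centers, iterations=20):
    """Plain Lloyd iterations starting from the given centers."""
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for p in points:
            j = min(range(len(centers)),
                    key=lambda c: math.dist(p, centers[c]))
            clusters[j].append(p)
        new_centers = []
        for old, members in zip(centers, clusters):
            if members:
                # Coordinate-wise mean of the cluster members.
                new_centers.append(
                    tuple(sum(x) / len(members) for x in zip(*members)))
            else:
                new_centers.append(old)  # keep the center of an empty cluster
        if new_centers == centers:
            break
        centers = new_centers
    return centers


def initial_samples(points, k):
    """Indices of the samples closest to each K-means center."""
    seeds = max_min_seeds(points, k)
    centers = k_means(points, [tuple(points[i]) for i in seeds])
    return [min(range(len(points)),
                key=lambda i: math.dist(points[i], c))
            for c in centers]


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    print(initial_samples(data, 2))
```

Max-min seeding alone tends to pick points at the edge of the data, which may be outliers. Refining the seeds with K-means and then choosing the sample nearest each center is one plausible way to keep the initial points representative of their clusters.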