Research and Application of the Support Vector Machine on Large-Scale Data

Posted on: 2014-01-29
Degree: Master
Type: Thesis
Country: China
Candidate: W B Yang
GTID: 2248330395977503
Subject: Mathematics
Abstract/Summary:
The support vector machine (SVM) is one of the most important machine learning methods for solving classification problems. Built on statistical learning theory, optimization, and the kernel trick, the SVM has a globally optimal solution and high generalization ability, and it avoids the "curse of dimensionality". It has been applied successfully in areas such as face recognition, bioinformatics, fault diagnosis, network security, and text classification.

SVM has advantages in pattern recognition on small-scale, high-dimensional data. On the massive data sets that are now common, however, it needs to be improved and extended because of drawbacks such as heavy memory consumption and long training time. Starting from the theoretical foundations of SVM, this thesis analyzes its geometric characteristics and carries out a preliminary study of its application to large-scale data.

By analyzing the geometric relationship between the classification hypersurface and the support vectors (SVs), a step-by-step training strategy is proposed. First, the raw data, which contain many samples, are clustered on a grid into a smaller data set, and the kernel function and its parameters can be chosen according to the distribution of the data points. The SVM is trained on the clustered data to obtain the potential SVs and a preliminary decision function, and further training on these results yields the final decision function.

If there are a large number of potential support vectors, or the original data set has nominal attributes, a segmentation approach is proposed in which the final decision function is derived from local training results.

Finally, both strategies are trained on commonly used test data sets and compared with the standard SVM algorithm in terms of accuracy and generalization ability. Numerical results verify that the proposed SVM algorithms are efficient and effective.
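
The grid-clustering step of the step-by-step strategy can be illustrated with a minimal sketch. It is not the thesis's implementation: the function names `grid_condense` and `expand_groups`, the uniform `cell_size`, the use of centroids as cluster representatives, and the choice to keep the two classes separate inside a cell are all assumptions made for illustration. The SVM training itself is left out; any SVM trainer could be run on the returned centroids, and the groups whose centroids turn out to be support vectors could then be expanded back to the original samples for the second training pass.

```python
from collections import defaultdict
from math import floor


def grid_condense(points, labels, cell_size):
    """Group samples by (grid cell, class) and return one centroid per group.

    Assumption: a uniform grid with side length `cell_size` in every dimension.
    """
    groups = defaultdict(list)
    for idx, (p, y) in enumerate(zip(points, labels)):
        cell = tuple(floor(v / cell_size) for v in p)
        groups[(cell, y)].append(idx)

    centroids, centroid_labels, keys = [], [], []
    for key, members in groups.items():
        dim = len(points[members[0]])
        centroid = [
            sum(points[i][d] for i in members) / len(members) for d in range(dim)
        ]
        centroids.append(centroid)
        centroid_labels.append(key[1])
        keys.append(key)
    return centroids, centroid_labels, keys, dict(groups)


def expand_groups(groups, selected_keys):
    """Return the sorted indices of the original samples in the selected groups.

    Hypothetical second-stage helper: `selected_keys` would be the groups whose
    centroids were support vectors of the preliminary SVM.
    """
    return sorted(i for k in selected_keys for i in groups[k])


# Toy data: six 2-D samples with labels -1 / +1.
points = [(0.1, 0.2), (0.3, 0.4), (0.2, 0.1), (2.5, 2.6), (2.7, 2.9), (0.9, 0.8)]
labels = [-1, -1, -1, 1, 1, 1]

centroids, centroid_labels, keys, groups = grid_condense(points, labels, cell_size=1.0)
print(len(points), "samples condensed to", len(centroids), "centroids")
```

Keeping the classes apart within a cell means that a cell straddling the class boundary contributes one representative per class, so the region where support vectors are likely to lie is not averaged away.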
Keywords/Search Tags: Support Vector Machine, Clustering, Kernel Trick, Data Mining, Large-scale Data