Research On Support Vector Machine Based On Improved CLIQUE Algorithm

Posted on:2018-10-29

Degree:Master

Type:Thesis

Country:China

Candidate:M Z Xu

Full Text:PDF

GTID:2348330542990932

Subject:Computer Science and Technology

Abstract/Summary:

In recent years, with the rapid development of computer technology, data has increasingly become massive, high-dimensional, and nonlinear. How to extract useful information from high-dimensional and massive data is an important issue in the field of data mining. The support vector machine (SVM) and clustering algorithms, as representative data mining algorithms, have attracted growing attention. The CLIQUE algorithm is fast and handles high-dimensional data well, but its two parameters are difficult to determine and its accuracy is not high, so the user needs a strong prior estimate of the data set. SVM is accurate and efficient for classification on small-scale data sets, but training on large-scale data sets is very inefficient. In this thesis, we propose a new method that improves SVM training efficiency on large-scale data sets by training on two smaller training sets instead of one large training set. First, an adaptive parameter determination algorithm and a grid density record table are proposed to address two shortcomings of CLIQUE: the difficulty of setting its parameters and the loss of high-density grid cells. Then the improved CLIQUE algorithm is used to preprocess the SVM's data set, yielding a representative sample set that reflects the distribution of the whole sample set. Training on the representative sample set gives an approximate separating hyperplane, and the three distance standards proposed in this thesis are used to collect the samples near that hyperplane, forming the exact training set. Finally, the optimal hyperplane is obtained by training on the exact training set. The experimental results show that distance standard 1 is the fastest, but its accuracy is not guaranteed; distance standard 3 is the most accurate, but the slowest. Distance standard 2 is more accurate than distance standard 1 and has lower time complexity than distance standard 3. Therefore, the SVM training time optimization algorithm using distance standard 2 is an effective method for accelerating SVM training on large data sets: it maintains the accuracy of the training results while running faster than the original SVM algorithm.
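
The two-stage idea described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation and makes several simplifying assumptions: a single uniform grid over all dimensions stands in for the improved CLIQUE step (the adaptive parameter determination and the grid density record table are not reproduced), a linear SVM trained by batch subgradient descent stands in for the SVM solver, and the filter is a plain Euclidean distance threshold, which is not one of the three distance standards (the abstract does not define them). All function names and parameters (`cells_per_dim`, `min_count`, `max_dist`, and so on) are hypothetical.

```python
import math
from collections import defaultdict


def grid_representatives(X, y, cells_per_dim=4, min_count=2):
    """Return one centroid per dense (grid cell, label) pair (assumed stand-in for CLIQUE)."""
    dims = range(len(X[0]))
    lo = [min(x[d] for x in X) for d in dims]
    span = [(max(x[d] for x in X) - lo[d]) or 1.0 for d in dims]
    cells = defaultdict(list)
    for x, label in zip(X, y):
        key = tuple(
            min(int((x[d] - lo[d]) / span[d] * cells_per_dim), cells_per_dim - 1)
            for d in dims
        )
        cells[(key, label)].append(x)
    reps, rep_labels = [], []
    for (_, label), members in cells.items():
        if len(members) >= min_count:  # sparse cells are dropped
            reps.append([sum(col) / len(members) for col in zip(*members)])
            rep_labels.append(label)
    return reps, rep_labels


def train_linear_svm(X, y, lam=0.01, lr=0.05, iters=1000):
    """Batch subgradient descent on the L2-regularised hinge loss; labels are +1/-1."""
    n, dims = len(X), len(X[0])
    w, b = [0.0] * dims, 0.0
    for _ in range(iters):
        grad_w, grad_b = [lam * wj for wj in w], 0.0
        for x, label in zip(X, y):
            if label * (sum(wj * xj for wj, xj in zip(w, x)) + b) < 1:
                grad_w = [g - label * xj / n for g, xj in zip(grad_w, x)]
                grad_b -= label / n
        w = [wj - lr * g for wj, g in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def near_hyperplane(X, w, b, max_dist):
    """Indices of samples within Euclidean distance max_dist of w.x + b = 0."""
    norm = math.sqrt(sum(wj * wj for wj in w)) or 1.0
    return [
        i for i, x in enumerate(X)
        if abs(sum(wj * xj for wj, xj in zip(w, x)) + b) / norm <= max_dist
    ]


def two_stage_train(X, y, max_dist=2.0):
    """Train on representatives, then retrain on the samples near the first hyperplane."""
    reps, rep_labels = grid_representatives(X, y)
    if len(set(rep_labels)) < 2:  # fallback: no usable representatives
        reps, rep_labels = X, y
    w, b = train_linear_svm(reps, rep_labels)  # approximate hyperplane
    keep = near_hyperplane(X, w, b, max_dist)
    if len({y[i] for i in keep}) < 2:  # guard: refined set needs both classes
        return w, b, keep
    w, b = train_linear_svm([X[i] for i in keep], [y[i] for i in keep])
    return w, b, keep
```

Calling `two_stage_train(X, y)` with `X` as a list of feature lists and `y` as a list of `+1`/`-1` labels returns the final weights, the bias, and the indices of the samples used in the second stage. The threshold `max_dist` plays the role of the speed/accuracy trade-off that the abstract attributes to the choice of distance standard: a larger value keeps more samples and costs more training time.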

Keywords/Search Tags:

clustering algorithm, CLIQUE algorithm, support vector machine, training time optimization

Related items

1	Research Of Multiattribute And Large-scale Data Classification Algorithm Based On Support Vector Machine
2	Research On The Optimization Of Clustering Algorithm Based On Support Vector Machine
3	Research On Some Problems Of Support Vector Machine Learning Algorithm
4	Research On Support Vector Machine Accelerated Training Algorithm
5	Support Vector Machine Training Algorithm And Its Improvement
6	Research On Support Vector Machine Learning Algorithms
7	Study On Application Of Machine Learning Based On Support Vector Machine
8	A Parallel Training Algorithm Of Support Vector Machines And Parameter Optimization Based On Genetic Algorithm
9	Acceleration And Application Of Support Vector Machines
10	Research On Training Algorithm And Preprocessing Algorithm Of Support Vector Machine