
Research and Application of Clustering and Classification Algorithms on Biomedical Data

Posted on: 2013-10-20 | Degree: Master | Type: Thesis
Country: China | Candidate: X Li | Full Text: PDF
GTID: 2248330374480262 | Subject: Computer application technology
Abstract/Summary:
In recent years, data mining has become a very active field. Scholars all over the world are researching it, but most focus on specific applications rather than on algorithm research, and on laboratory development rather than on commercialization. At present, typical data mining methods include association rules, classification, clustering and Web mining. The Support Vector Machine (SVM) is one such data mining algorithm. Following improvements by various scholars, an improved algorithm exists that is mainly used for classifying small datasets, where it achieves better classification results than other methods. When it is trained on a large dataset, however, training takes a long time and requires a huge amount of memory, so mining efficiency is very low.

This thesis first introduces background knowledge on data mining and the WEKA development platform. Second, it describes clustering algorithms and the support vector machine classification algorithm in detail, to pave the way for the diabetes data analysis and the sequential minimal optimization (SMO) algorithm. It then describes the clustering analysis of diabetes data, in which several clustering experiments are run on an existing diabetes data set and some conclusions are drawn. Next, it presents the principle and derivation of the SMO algorithm and discusses the algorithm's flaws and shortcomings. In response to these defects, the thesis improves the algorithm by changing its storage strategy on the WEKA software platform. While preserving the correct classification rate, these changes shorten training time, reduce the storage space required, and greatly improve the efficiency of the algorithm, making it better suited to training on massive data sets. Finally, prospects for further research are discussed.
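To make the SMO idea concrete, the following is a minimal sketch of the widely taught simplified SMO procedure for a linear-kernel SVM. It is not the thesis's improved algorithm and not WEKA's implementation; the function names, the toy data, and the parameters (`C`, `tol`, `max_passes`, `max_sweeps`) are assumptions chosen for illustration. The sketch stores the full kernel matrix, which takes n × n entries for n training samples and illustrates why memory use grows quickly on large datasets.

```python
import random


def linear_kernel(a, b):
    """Dot product of two feature vectors."""
    return sum(x * y for x, y in zip(a, b))


def simplified_smo(X, y, C=1.0, tol=1e-3, max_passes=10, max_sweeps=1000, seed=0):
    """Simplified SMO: optimize two Lagrange multipliers at a time.

    Labels in y must be +1 or -1. Returns (alpha, b).
    """
    rng = random.Random(seed)
    n = len(X)
    alpha = [0.0] * n
    b = 0.0
    # Full kernel matrix: n * n stored values (the memory cost on big data).
    K = [[linear_kernel(X[i], X[j]) for j in range(n)] for i in range(n)]

    def f(i):
        # Decision value for training sample i under the current model.
        return sum(alpha[k] * y[k] * K[k][i] for k in range(n)) + b

    passes = sweeps = 0
    while passes < max_passes and sweeps < max_sweeps:
        sweeps += 1
        changed = 0
        for i in range(n):
            Ei = f(i) - y[i]
            # Skip samples that already satisfy the KKT conditions.
            if not ((y[i] * Ei < -tol and alpha[i] < C) or
                    (y[i] * Ei > tol and alpha[i] > 0)):
                continue
            j = rng.choice([k for k in range(n) if k != i])
            Ej = f(j) - y[j]
            ai_old, aj_old = alpha[i], alpha[j]
            # Bounds that keep both multipliers inside [0, C].
            if y[i] != y[j]:
                lo, hi = max(0.0, aj_old - ai_old), min(C, C + aj_old - ai_old)
            else:
                lo, hi = max(0.0, ai_old + aj_old - C), min(C, ai_old + aj_old)
            if lo == hi:
                continue
            eta = 2 * K[i][j] - K[i][i] - K[j][j]
            if eta >= 0:
                continue
            aj = min(hi, max(lo, aj_old - y[j] * (Ei - Ej) / eta))
            if abs(aj - aj_old) < 1e-5:
                continue
            ai = ai_old + y[i] * y[j] * (aj_old - aj)
            b1 = b - Ei - y[i] * (ai - ai_old) * K[i][i] - y[j] * (aj - aj_old) * K[i][j]
            b2 = b - Ej - y[i] * (ai - ai_old) * K[i][j] - y[j] * (aj - aj_old) * K[j][j]
            alpha[i], alpha[j] = ai, aj
            if 0 < ai < C:
                b = b1
            elif 0 < aj < C:
                b = b2
            else:
                b = (b1 + b2) / 2
            changed += 1
        # Count consecutive sweeps in which no multiplier changed.
        passes = passes + 1 if changed == 0 else 0
    return alpha, b


def predict(X, y, alpha, b, x):
    """Classify x as +1 or -1 using the trained multipliers."""
    score = sum(a * t * linear_kernel(xi, x) for a, t, xi in zip(alpha, y, X)) + b
    return 1 if score >= 0 else -1


# Hypothetical toy data: two well-separated classes in two dimensions.
X_toy = [[2.0, 2.0], [3.0, 3.0], [2.0, 3.0], [-2.0, -2.0], [-3.0, -3.0], [-2.0, -3.0]]
y_toy = [1, 1, 1, -1, -1, -1]
alpha_toy, b_toy = simplified_smo(X_toy, y_toy)
predictions = [predict(X_toy, y_toy, alpha_toy, b_toy, x) for x in X_toy]
```

Each call to `f` in this sketch recomputes a decision value from scratch, so training time also grows quickly with the number of samples. Caching error values or storing only part of the kernel matrix are common ways to trade memory against time; which strategy the thesis adopts is not stated in the abstract.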
Keywords/Search Tags: data mining, clustering algorithm, diabetes data, Support Vector Machine algorithm, SMO