
A Study On Algorithm Of Classification And Cluster Based On Data Mining And Realization By R Programe

Posted on: 2008-08-22
Degree: Master
Type: Thesis
Country: China
Candidate: K N Fang
Full Text: PDF
GTID: 2178360215996368
Subject: Statistics
Abstract/Summary:
Data mining is a new field of study that draws on many disciplines, including statistics, databases, and machine learning, and it has attracted wide attention for its powerful capabilities and broad range of applications. Of the many data mining methods, classification and clustering are two of the most widely applied. Algorithms are the most important area of data mining research, because the quality of an algorithm directly affects the efficiency of data mining, so this thesis studies classification and clustering algorithms in depth and systematically. Although many papers have studied these algorithms, most discuss only the theory and do not implement them. This thesis emphasizes implementation and, for the first time in China, implements the algorithms in R, which has advantages over other software: it is free, it is open source, and its algorithms are updated quickly.
Chapter 1 introduces the background, purpose, significance, methods, and structure of the study. Chapter 2 introduces and compares the classification algorithms and implements each of them in R: the distance-based KNN algorithm, the decision-tree-based C4.5 and CART algorithms, and the neural-network-based BP (backpropagation) algorithm. Chapter 3 introduces and compares the clustering algorithms and implements each of them in R: the partitioning methods K-means, PAM, and CLARA; the hierarchical methods AGNES and DIANA; the density-based method DBSCAN; the model-based method COBWEB; and the fuzzy clustering method FCM. Chapter 4 is an empirical demonstration. Using data on nurse turnover collected by Professor Cai Xinling of Taiwan as an example, it analyzes the data following the standard CRISP-DM process: it first applies simple statistical analysis to gain a preliminary understanding of the data, then analyzes turnover intention with clustering methods and builds a predictive model with classification methods. Chapter 5 summarizes the thesis and outlines directions for future work.
Keywords/Search Tags: data mining, classification algorithm, clustering algorithm, implementation in R