A Comparative Analysis Of Classifying Algorithms In Data Mining Technology

Posted on:2008-10-31

Degree:Master

Type:Thesis

Country:China

Candidate:M C Zheng

Full Text:PDF

GTID:2167360215467588

Subject:Statistics

Abstract/Summary:

Classification is a major research topic in data mining. It builds a model from the characteristics of a data set and uses that model to assign categories to samples of unknown type. Current classification algorithms include statistical classification, decision trees, neural networks, and others. Different classification methods produce different classification models, and the quality of the model directly affects the efficiency and accuracy of data mining. It is therefore vital to choose the most effective algorithm when classifying large quantities of data.

Studies of classification algorithms in data mining so far fall into several types: surveys of classification algorithms, improvements to classification algorithms, combinations of particular classification algorithms, experimental studies of classification algorithms on small samples, and studies and applications of a single given classification algorithm. At present, most researchers tend to propose new algorithms but seldom conduct experimental analysis or comparison of algorithms. Contrastive studies of all existing algorithms applied to a particular data set are especially lacking.

To fill this gap, this thesis studies the classification problem in data mining in depth through concrete examples, analyzing and comparing the characteristics of each algorithm. It concludes that the neural network algorithm performs better overall. We also find that different types of data sets, data sets from different domains, different classification patterns, different comparison criteria, and different classification methods all produce different results. Classification methods must therefore be chosen for each data set according to its own characteristics and classification patterns. Only in this way can we expect to minimize errors and ensure highly accurate classification results.
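
The kind of comparison the abstract describes can be illustrated with a minimal sketch. It is not the thesis's experiment: the data are synthetic, only two of the algorithm families named in the keywords (Bayes and logistic regression) are included for brevity, and every name in it (`make_data`, `GaussianNaiveBayes`, `LogisticRegression`, `cross_val_accuracy`) is hypothetical. It uses only the Python standard library and estimates accuracy with 5-fold cross-validation, which is one possible comparison criterion among many.

```python
import math
import random


def make_data(n=200, seed=0):
    """Synthetic two-class data: two Gaussian blobs in 2-D (illustrative only)."""
    rng = random.Random(seed)
    data = []
    for i in range(n):
        label = i % 2
        centre = 2.0 if label else -2.0
        data.append(([rng.gauss(centre, 1.0), rng.gauss(centre, 1.0)], label))
    rng.shuffle(data)
    return data


class GaussianNaiveBayes:
    """Naive Bayes with one Gaussian per feature and class."""

    def fit(self, X, y):
        self.stats = {}
        for c in set(y):
            rows = [x for x, t in zip(X, y) if t == c]
            cols = list(zip(*rows))
            means = [sum(col) / len(col) for col in cols]
            variances = [sum((v - m) ** 2 for v in col) / len(col) + 1e-9
                         for col, m in zip(cols, means)]
            self.stats[c] = (math.log(len(rows) / len(y)), means, variances)
        return self

    def _score(self, x, log_prior, means, variances):
        # Log prior plus the sum of per-feature Gaussian log densities.
        return log_prior + sum(
            -0.5 * math.log(2 * math.pi * s) - (v - m) ** 2 / (2 * s)
            for v, m, s in zip(x, means, variances))

    def predict(self, X):
        return [max(self.stats, key=lambda c: self._score(x, *self.stats[c]))
                for x in X]


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


class LogisticRegression:
    """Binary logistic regression trained by stochastic gradient descent."""

    def __init__(self, lr=0.1, epochs=200):
        self.lr = lr
        self.epochs = epochs

    def fit(self, X, y):
        self.w = [0.0] * len(X[0])
        self.b = 0.0
        for _ in range(self.epochs):
            for x, t in zip(X, y):
                p = sigmoid(self.b + sum(wi * xi for wi, xi in zip(self.w, x)))
                step = self.lr * (t - p)
                self.w = [wi + step * xi for wi, xi in zip(self.w, x)]
                self.b += step
        return self

    def predict(self, X):
        return [1 if self.b + sum(wi * xi for wi, xi in zip(self.w, x)) >= 0 else 0
                for x in X]


def cross_val_accuracy(make_model, data, k=5):
    """Mean accuracy over k contiguous folds of already shuffled data."""
    fold = len(data) // k
    accuracies = []
    for i in range(k):
        test = data[i * fold:(i + 1) * fold]
        train = data[:i * fold] + data[(i + 1) * fold:]
        model = make_model().fit([x for x, _ in train], [t for _, t in train])
        preds = model.predict([x for x, _ in test])
        accuracies.append(sum(p == t for p, (_, t) in zip(preds, test)) / len(test))
    return sum(accuracies) / k


data = make_data()
results = {name: cross_val_accuracy(cls, data)
           for name, cls in [("naive_bayes", GaussianNaiveBayes),
                             ("logistic_regression", LogisticRegression)]}
for name, acc in results.items():
    print(f"{name}: {acc:.3f}")
```

Because both models are scored on identical folds, any difference in the printed accuracies reflects the algorithms and not the data split. On a different data set or under a different criterion the ranking may change, which is the point the abstract makes.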

Keywords/Search Tags:

Data mining, Classification, Logistic regression, Bayes, Decision tree, Neural network

Related items

1	Analysis On The Influence Factors Of Golden State Warriors Victory In NBA Based On Data Mining Technology
2	Based On Data Mining Of Sino-german Housing Savings Bank Client Precise Marketing Research
3	Research On User Behavior Analysis Of Network Teaching Platform Based On Data Mining
4	Research On The Application Of Decision Tree In The Employment Guidance For College Students
5	Research And Application Of Education Data Mining Based On Decision Tree Technology
6	Data Mining Techniques Application And Research In Human Resource Market
7	Research Based On Data Mining Techniques And Its Application In Reform School Students
8	Research And Application Of Student Performance Prediction Algorithm Based On C5.0 Decision Tree Algorithm
9	Prediction Of Weibo Member Loss
10	Based On Decision Tree Network Of College Students' Academic Research Influence Factors