Classification Algorithm Of Data Mining

Posted on:2009-06-21

Degree:Master

Type:Thesis

Country:China

Candidate:W X Guo

Full Text:PDF

GTID:2178360242983095

Subject:Computer software and theory

Abstract/Summary:

With the rapid development of human society and computer technology, electronic data has accumulated at an explosive rate. These enormous volumes of data undoubtedly contain abundant latent knowledge that is important to people, yet traditional data analysis tools exploit only a small proportion of it. Data mining, a technology that is still developing, helps people discover latent knowledge in data. Classification is an important data mining method. Classification methods can be compared and evaluated according to five criteria: accuracy, speed, robustness, scalability, and interpretability. Of these five, predictive accuracy is the most important. This thesis surveys and analyzes, with respect to these five criteria, classification methods popular in China and abroad, including decision tree classification, Bayesian classification, neural network based classification, and support vector machine based classification.

Among these methods, the decision tree is one of the most widely adopted models. The thesis therefore concentrates on the decision tree: it covers every major stage of the tree-building process, studies in depth the main problems decision trees currently face and their future development, and proposes several effective new ways to improve decision tree performance, contributing to its further application. Attribute selection, discretization, and dimension reduction are shared by decision trees and other data mining methods, so improving them can benefit not only the decision tree but other data mining methods as well. The work is therefore of positive significance for the development of data mining technology.

The main research contents are as follows:
(1) A novel dimension reduction algorithm is proposed.
(2) A weighted binary search algorithm is proposed to discretize continuous attributes.
(3) An improvement to the attribute selection criterion is proposed.
(4) Building on the preceding work, the classical decision tree is optimized and the improvements are integrated, and an improved algorithm procedure is proposed. In comparison with the C4.5 algorithm, experimental results show that the improved algorithm is superior.
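
For orientation, the C4.5 baseline named in the abstract ranks attributes by gain ratio and handles a continuous attribute with a binary threshold split. The following is a minimal Python sketch of those two standard steps only. It is not the improved selection criterion or the weighted binary search proposed in the thesis, whose details the abstract does not give. The function names are illustrative assumptions, and using midpoints between sorted values as candidate thresholds is a common simplification.

```python
import math
from collections import Counter


def entropy(items):
    """Shannon entropy (in bits) of a list of hashable items."""
    total = len(items)
    counts = Counter(items)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def gain_ratio(values, labels):
    """Gain ratio of a categorical attribute, as used by C4.5.

    values[i] is the attribute value of sample i, labels[i] its class.
    """
    total = len(labels)
    partitions = {}
    for v, y in zip(values, labels):
        partitions.setdefault(v, []).append(y)
    # Expected entropy of the class labels after splitting on the attribute.
    remainder = sum(len(p) / total * entropy(p) for p in partitions.values())
    info_gain = entropy(labels) - remainder
    # Split information: entropy of the attribute's own value distribution.
    split_info = entropy(values)
    return info_gain / split_info if split_info > 0 else 0.0


def best_threshold(values, labels):
    """Binary discretization of a continuous attribute.

    Tries the midpoint between each pair of adjacent distinct sorted values
    and returns (threshold, information_gain) for the best split, or
    (None, 0.0) if no split yields a positive gain.
    """
    pairs = sorted(zip(values, labels))
    total = len(pairs)
    base = entropy(labels)
    best = (None, 0.0)
    for i in range(1, total):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold can separate equal values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        gain = base - (
            len(left) / total * entropy(left)
            + len(right) / total * entropy(right)
        )
        if gain > best[1]:
            best = (threshold, gain)
    return best
```

A tree builder in this style would call `gain_ratio` for each categorical attribute and `best_threshold` for each continuous one at every node, then split on the highest-scoring attribute.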

Keywords/Search Tags:

data mining, Decision Tree, discretization, dimension reduction, attribute choosing

Related items

1	Classification Algorithm Of Data Mining
2	The Data Mining Algorithm Based On Rough Sets
3	Research On Some Problems Of Decision Tree In Data Mining
4	Research On Decision Tree Based On Rough Set Theory In Classification
5	Research On Decision Forest Algorithm Based On Attribute Reduction
6	Research On Attribute Reduction Algorithm Based On Decision Tree And Information Entropy
7	Data Mining Research Of Vehicle Sales Based On Hash Quick Attribute Reduction Algorithm
8	Research On Decision Tree Algorithm Based On Rough Set Theory
9	Heuristic Mode Research Of Decision Tree And Its Application In Attribute Reduction
10	Research On Decision Tree Algorithm And Application Based On Data Mining