The Parallel Of Bayesian Classification Algorithms For CRM

Posted on:2004-09-20

Degree:Master

Type:Thesis

Country:China

Candidate:S G Zhong

Full Text:PDF

GTID:2168360095456766

Subject:Computer system architecture

Abstract/Summary:

With the development of information technology and sharpening competition, data mining is more widely deployed in enterprise production and management than before. As this trend continues and enterprise data volumes grow, a single personal computer can no longer cope, so parallel technology must be adopted. Building on data mining, together with advances in management thinking and management technology, enterprises are adopting customer relationship management (CRM). Contact management, increasing customer profitability, customer segmentation, and cross-marketing are important concepts in CRM. From the data mining point of view, all of these are closely tied to classification algorithms, so the study of classification algorithms is very important to enterprises applying CRM.

Commonly used classification methods include decision tree induction, Bayesian classification and Bayesian belief networks, k-nearest neighbor classifiers, rough set theory, and fuzzy set approaches. Classification methods can be compared and evaluated according to the following criteria: predictive accuracy, speed, scalability, robustness, and interpretability. By these criteria, the advantages of Bayesian classification are evident. Bayesian classification is based on Bayes' theorem. It is comparable to decision trees in interpretability and to neural network classifiers in speed. Bayesian classifiers have also exhibited high accuracy and speed when applied to large databases.

The naive Bayesian classification algorithm performs unsatisfactorily on continuous attributes. Therefore, the paper proposes a new discretization method inspired by Holte's 1R (one rule) discretization technique and the entropy mechanism. The method merges intervals from the top down. It neither splits intervals nor uses the recursive procedure that is common in entropy-based discretization. It also avoids searching for a split point by trying every value.

A network of workstations (NOW) can rival the performance of dedicated parallel systems, and its performance-to-price ratio is higher. PVM (Parallel Virtual Machine), a NOW platform, is widely used. The paper implements a parallel version of the optimized Bayesian classification algorithm on PVM and analyzes its speedup and complexity. The analysis indicates that the algorithm performs very well when the number of classes or the volume of data is very large.

Finally, the paper explains how to use the parallel algorithm in an enterprise and its significance for the enterprise.
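
The abstract does not say how the work is divided among PVM tasks, so the following Python fragment is only an illustrative sketch of one common way to parallelize naive Bayesian training. It assumes the attributes are already discretized, that the data is partitioned by rows, and that a master process sums the per-partition counts. A thread pool stands in for PVM workers, and the names `count_chunk`, `train_parallel`, and `predict` are hypothetical, not taken from the thesis.

```python
import math
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def count_chunk(rows):
    """Worker step: count labels and (attribute index, value, label) triples in one chunk."""
    class_counts, feature_counts = Counter(), Counter()
    for features, label in rows:
        class_counts[label] += 1
        for i, value in enumerate(features):
            feature_counts[(i, value, label)] += 1
    return class_counts, feature_counts


def train_parallel(rows, n_workers=2):
    """Master step: split rows into chunks, count them concurrently, then sum the counts."""
    chunks = [rows[i::n_workers] for i in range(n_workers)]
    class_counts, feature_counts = Counter(), Counter()
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        for cc, fc in pool.map(count_chunk, chunks):
            class_counts.update(cc)    # Counter.update adds counts
            feature_counts.update(fc)
    return class_counts, feature_counts


def predict(model, features, n_values):
    """Return the label with the highest log posterior (Laplace-smoothed).

    n_values[i] is the number of distinct values attribute i can take.
    """
    class_counts, feature_counts = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_label in class_counts.items():
        score = math.log(n_label / total)  # log prior
        for i, value in enumerate(features):
            smoothed = (feature_counts[(i, value, label)] + 1) / (n_label + n_values[i])
            score += math.log(smoothed)    # log conditional likelihood
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy CRM-style data (made up for illustration): (spend level, visit frequency) -> label
rows = [
    (("high", "frequent"), "loyal"),
    (("high", "rare"), "loyal"),
    (("medium", "frequent"), "loyal"),
    (("low", "rare"), "churn"),
    (("low", "frequent"), "churn"),
    (("medium", "rare"), "churn"),
]
model = train_parallel(rows, n_workers=3)
print(predict(model, ("high", "frequent"), n_values=[3, 2]))
```

Because training reduces to counting, the merged counts do not depend on how the rows are partitioned, which is what makes this kind of data-parallel split straightforward. Whether the thesis partitions by rows, by attributes, or by classes is not stated in the abstract.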

Keywords/Search Tags:

Data mining, CRM, classification algorithm, parallel, Bayes, PVM

Related items

1	Research Of Classification In Data Mining Based On Bayes Technology
2	Design And Implementation Of Job Recruitment System Based On Data Mining Technology
3	Research And Implement On Data Mining Algorithm Parallel Based On Hadoop
4	Research On Text Mining Based On MapReduce
5	Research On Classification Algorithms Of Data Mining Based On Imbalanced Data Sets
6	Research And Improvement Of Attribute Weighted Naive Bayes Classification Algorithm
7	Hadoop-based Parallel Algorithm For Mining
8	Research On Bayes Method Based Data Classification
9	The Research On Discretization Oriented To Naive Bayes Algorithm
10	Research And Application On Naive Bayes Classification Algorithm