The Parallel Of Bayesian Classification Algorithms For CRM

Posted on: 2004-09-20
Degree: Master
Type: Thesis
Country: China
Candidate: S G Zhong
Full Text: PDF
GTID: 2168360095456766
Subject: Computer system architecture
Abstract/Summary:
With the development of information technology and intensifying competition, data mining is used more widely than before in enterprise production and management. As this trend continues and enterprise data volumes grow, a single personal computer can no longer handle the workload, so parallel technology must be adopted. Built on data mining, and driven by advances in management thinking and management technology, CRM (customer relationship management) is being adopted by enterprises. Contact management, increasing customer profitability, customer segmentation and cross-selling are important concepts in CRM. From a data mining perspective, all of them are closely tied to classification algorithms, so the study of classification algorithms is very important to enterprises applying CRM.

Commonly used classification methods include decision tree induction, Bayesian classification and Bayesian belief networks, k-nearest neighbor classifiers, rough set theory and fuzzy set approaches. Classification methods can be compared and evaluated by the following criteria: predictive accuracy, speed, scalability, robustness and interpretability. By these criteria, the advantages of Bayesian classification are evident. Bayesian classification is based on Bayes' theorem. It is comparable in interpretability to decision trees and in speed to neural network classifiers. Bayesian classifiers have also exhibited high accuracy and speed when applied to large databases.

The naive Bayesian classification algorithm performs unsatisfactorily on continuous attributes. Therefore, the paper proposes a new discretization method inspired by Holte's 1R (one rule) discretization technique and the mechanism of entropy. The method merges intervals from the top down. It neither splits intervals nor uses the recursive procedure that is common in entropy-based discretization. Furthermore, it avoids searching for a split point by trying every value.

NOW (networks of workstations) rivals dedicated parallel systems in performance and has a better price-performance ratio. PVM (Parallel Virtual Machine), a platform for NOW, has been widely used. The paper implements the parallel algorithm of the optimized Bayesian classification on PVM and analyzes its speedup and complexity. The analysis indicates that the algorithm performs well when the number of classes is large or the data set is very large.

Finally, the paper explains how to use the parallel algorithm in an enterprise and what it means for the enterprise.
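
The abstract does not give the details of the parallel algorithm, so the following is only a minimal sketch of one common way to parallelize naive Bayesian training: each worker counts class and attribute-value frequencies on its own partition of the data, and a master sums the partial counts. Everything here is an assumption made for illustration: a Python thread pool stands in for PVM message passing, the attributes are taken to be already discretized, and the function names (`count_partition`, `train_parallel`, `classify`) and the toy customer data are hypothetical, not taken from the thesis.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
import math


def count_partition(rows):
    """Worker step: count labels and (attribute index, value, label) triples."""
    class_counts = Counter()
    feature_counts = Counter()
    for features, label in rows:
        class_counts[label] += 1
        for i, value in enumerate(features):
            feature_counts[(i, value, label)] += 1
    return class_counts, feature_counts


def train_parallel(rows, n_workers=2):
    """Master step: split the rows, count in parallel, sum the partial counts."""
    chunks = [rows[k::n_workers] for k in range(n_workers)]
    class_counts, feature_counts = Counter(), Counter()
    # Assumption: a thread pool replaces PVM tasks; it only illustrates
    # the split/merge structure, not real speedup.
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        for partial_classes, partial_features in pool.map(count_partition, chunks):
            class_counts.update(partial_classes)      # Counter.update adds counts
            feature_counts.update(partial_features)
    return class_counts, feature_counts


def classify(features, class_counts, feature_counts, n_values):
    """Return the label with the highest log posterior (Laplace smoothing)."""
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_label in class_counts.items():
        score = math.log(n_label / total)             # log prior
        for i, value in enumerate(features):
            count = feature_counts[(i, value, label)]
            score += math.log((count + 1) / (n_label + n_values[i]))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy data: (spend level, age group) -> customer class.
rows = [
    (("low", "young"), "stay"),
    (("low", "old"), "stay"),
    (("high", "young"), "churn"),
    (("high", "old"), "churn"),
    (("low", "young"), "stay"),
    (("high", "young"), "churn"),
]
n_values = [2, 2]  # number of distinct values per attribute
model = train_parallel(rows, n_workers=3)
prediction = classify(("high", "old"), *model, n_values)
```

Because the model is nothing more than summed counts, the merged result is identical to what a single pass over all rows would produce, and the amount of data exchanged between workers and master depends on the number of classes and attribute values rather than on the number of rows.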
Keywords/Search Tags:Data mining, CRM, classification algorithm, parallel, Bayes, PVM