
Research On Distributed Data Mining Model Based On Bayesian Network

Posted on: 2008-08-13
Degree: Master
Type: Thesis
Country: China
Candidate: J Zhang
Full Text: PDF
GTID: 2178360215491287
Subject: Management Science and Engineering
Abstract/Summary:
Facing increasingly fierce competition, enterprises have adopted information systems extensively, and ERP in particular can enhance their commercial competitiveness. In turn, mining the massive data held in ERP systems can yield potentially useful knowledge to support business decisions. However, as a business develops and grows, the volume of data under database management increases greatly, and database management gradually shifts from centralized to distributed. How to mine these distributed databases has become a new challenge.
Traditional data mining is a local analysis tool that can only extract understandable or general knowledge from local datasets. In a distributed environment, the nodes are physically distributed, the data to be processed is massive, and the security and privacy of data that cannot be shared must be considered. To address these problems, and focusing on the classification and prediction tasks of data mining, this thesis introduces a distributed data mining model based on Bayesian network (DDMMBN). The model uses the Bee-gent system, which provides mobile agent functionality, as its framework, relational learning of the Bayesian network as its method, and a multi-branch attribute tree as its intermediate representation. It learns an integrated Bayesian network from distributed business databases and uses inference on that network to classify customers and forecast consumption.
The multi-branch attribute tree introduced in this model reflects the characteristic values of the attributes in the distributed datasets. A mobile agent obtains the tree by visiting each distributed dataset and calling the tree-building algorithm, and the tree is then used to create a Bayesian network. The tree handles the distribution problem well: it does not require collecting all the data, which greatly reduces the burden on the network and saves local storage space. Moreover, because the tree contains only the characteristic values of the attributes and not the details of individual data records, it also addresses, to a certain extent, the privacy concerns of distributed datasets.
After a detailed explanation of Bayesian network theory, distributed data mining technology, and mobile agent technology, this thesis introduces DDMMBN for customer classification and consumption forecasting in commercial enterprises. A prototype system was built on the Bee-gent system and evaluated on existing business data. Compared with the data collection method and the weighted voting method, the prototype demonstrated its efficiency and high classification accuracy.
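The idea of shipping per-attribute summaries instead of raw records can be illustrated with a small sketch. The abstract does not give the tree-building algorithm or the learned network structure, so the code below makes two loudly stated assumptions: a flat table of counts stands in for the attribute tree, and a naive Bayes classifier (the simplest fixed-structure Bayesian network) stands in for the learned network. All function names, attribute names, and data are hypothetical and are not taken from the thesis or from Bee-gent.

```python
from collections import Counter

CLASS_ATTR = "segment"  # hypothetical class attribute


def local_counts(records):
    """Summarize one site's records as counts (a stand-in for the attribute tree).

    Only aggregated counts leave the site; individual records do not.
    """
    class_counts = Counter()
    pair_counts = Counter()  # keys: (attribute, value, class label)
    for rec in records:
        label = rec[CLASS_ATTR]
        class_counts[label] += 1
        for attr, val in rec.items():
            if attr != CLASS_ATTR:
                pair_counts[(attr, val, label)] += 1
    return class_counts, pair_counts


def merge_counts(summaries):
    """Combine the per-site summaries, as a visiting agent might do."""
    total_class, total_pair = Counter(), Counter()
    for class_counts, pair_counts in summaries:
        total_class.update(class_counts)  # Counter.update adds counts
        total_pair.update(pair_counts)
    return total_class, total_pair


def classify(record, class_counts, pair_counts):
    """Naive Bayes prediction from merged counts, with add-one smoothing."""
    # Number of distinct values seen for each attribute across all sites.
    domain = {}
    for attr, val, _ in pair_counts:
        domain.setdefault(attr, set()).add(val)

    total = sum(class_counts.values())
    best_label, best_score = None, -1.0
    for label, n_label in class_counts.items():
        score = n_label / total  # prior P(class)
        for attr, val in record.items():
            seen = pair_counts[(attr, val, label)]
            score *= (seen + 1) / (n_label + len(domain[attr]))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Two hypothetical sites that never exchange raw records.
site_a = [
    {"age": "young", "spend": "high", "segment": "premium"},
    {"age": "old", "spend": "low", "segment": "basic"},
]
site_b = [
    {"age": "young", "spend": "high", "segment": "premium"},
    {"age": "old", "spend": "high", "segment": "basic"},
]

merged = merge_counts([local_counts(site_a), local_counts(site_b)])
print(classify({"age": "young", "spend": "high"}, *merged))  # prints: premium
```

Because counts are additive, the merged summary is identical to the one that would be computed on the pooled data, which is why, in this simplified setting, no site needs to transmit its records.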
Keywords/Search Tags: Bayesian network, multi-branches tree of attribute, structure learning, parameter learning, mobile agent