Research And Applications Of Incremental Bayesian Network Learning Algorithm Based On Big Data Platform

Posted on: 2018-04-21
Degree: Master
Type: Thesis
Country: China
Candidate: J Hu
Full Text: PDF
GTID: 2348330518494471
Subject: Software engineering
Abstract/Summary:
The Bayesian network (BN) is an important machine learning technique that is widely used to model relationships among random variables and is considered well suited to tasks such as prediction, classification, and causal analysis. In classification and prediction, Bayesian network models often achieve better precision than other commonly used models. However, because learning a BN is expensive in both time and space, and because its structure can be difficult to interpret, the Bayesian network has not been widely adopted for classification tasks. In addition, BN structure learning methods are heuristic algorithms. For Max-Min Hill-Climbing (MMHC), for example, the time complexity is uncertain, and the time the algorithm needs to converge can grow sharply when a massive amount of computation is required.

This paper aims to reduce the time cost of learning the BN structure. We propose an approach that combines MapReduce with the MMHC method to learn Bayesian networks. We first split the training data into several blocks and then learn sub-network structures in parallel on MapReduce. The subnets obtained from learning are used for prediction, and a boosting method is employed to integrate the prediction results from all of the subnets. Our experimental results show good precision in a real distributed environment.

A system that adjusts itself dynamically must update its model as new observational data arrive. As new data become available, both the descriptive ability of the existing network structure and the accuracy of its predictions need to be improved, because a model inevitably carries biases or errors from the phase in which it was built, and because, as the data evolve, a legacy model has difficulty adapting to the distribution of new batches. Disregarding the information in new data can lead to intolerable errors. Updating a BN model has two targets: parameter updating and structure updating. This paper therefore also offers an incremental learning solution, based on an incremental BN learning algorithm on the big data platform. The solution effectively represents historical observations and leverages them to update both the parameters and the structure.
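
The parameter-updating target can be illustrated with a minimal sketch. It is not the algorithm from the thesis, which the abstract does not specify. It assumes a fixed network structure, discrete variables, and a conditional probability table kept as running counts, so that historical observations are summarized by the counts and a new batch only adds to them. The class name `IncrementalCPT`, the `prior` pseudo-count, and the example variables are all hypothetical.

```python
from collections import defaultdict


class IncrementalCPT:
    """Conditional probability table for one node, stored as running counts.

    Hypothetical illustration: past data is summarized by the counts, so a
    new batch can be folded in without revisiting historical observations.
    """

    def __init__(self, states, prior=1.0):
        self.states = list(states)
        self.prior = prior  # assumed pseudo-count added to every state
        self.counts = defaultdict(lambda: defaultdict(float))

    def update(self, batch):
        """Fold a batch of (parent_config, state) observations into the counts."""
        for parent_config, state in batch:
            self.counts[parent_config][state] += 1.0

    def prob(self, state, parent_config):
        """Smoothed estimate of P(state | parent_config) from the counts."""
        row = self.counts.get(parent_config, {})
        total = sum(row.values()) + self.prior * len(self.states)
        return (row.get(state, 0.0) + self.prior) / total


# Historical batch: three observations of the node given parent value "rain".
cpt = IncrementalCPT(states=["yes", "no"])
cpt.update([(("rain",), "yes"), (("rain",), "yes"), (("rain",), "no")])
p_old = cpt.prob("yes", ("rain",))

# New batch arrives later; only the counts change, no old data is reread.
cpt.update([(("rain",), "no"), (("rain",), "no")])
p_new = cpt.prob("yes", ("rain",))
```

After the second batch the estimate for "yes" given "rain" drops, because the new observations shift the counts toward "no". Structure updating, the second target named in the abstract, is not covered by this sketch.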
Keywords/Search Tags:Bayesian Network, Big Data Platform, Incremental Learning, Classification, MMHC