The Study Of Learning Bayesian Network From Extremely Large Or Small Datasets

Posted on: 2010-10-17
Degree: Master
Type: Thesis
Country: China
Candidate: Y F Li
GTID: 2178360275459244
Subject: Computer application technology
Abstract/Summary:
This thesis studies the learning of Bayesian networks from extremely large or extremely small datasets, and its application. Extremely large datasets are those that cannot be loaded into memory as a whole. Extremely small datasets are those that remain small because the underlying experiments are expensive or otherwise limited. First, IBN-M, an efficient algorithm for incrementally learning Bayesian networks from data with missing values, is presented. The algorithm uses the structural EM algorithm to fill in the missing values in the dataset, and searches the larger space provided by parallel heuristic strategies to escape the local optima in which EM tends to get trapped. IBN-M also adopts incremental learning to work around the memory limit imposed by an extremely large dataset. Experiments show that IBN-M can learn comparatively accurate networks from extremely large datasets, which makes it an interesting improvement to incremental learning of Bayesian networks. Second, FCLBN, an efficient algorithm for learning Bayesian networks from extremely small datasets, is proposed. FCLBN uses the bootstrap to resample the small dataset, and estimates the high-confidence features of the original dataset from the Bayesian networks learned on the resampled datasets. These high-confidence features then guide the search for the best Bayesian network on the original dataset.
After being evaluated on standard benchmark datasets, FCLBN is applied to predicting yeast protein localization. The experimental results indicate that FCLBN can learn relatively accurate networks from small datasets. Lastly, inspired by the approach to learning Bayesian networks from small datasets, the MM-LBN algorithm is proposed to improve the learning of Bayesian networks from extremely large datasets. Each batch learning problem is converted into a problem of learning from an extremely small dataset, and the guidance of feature confidence is introduced into IBN-M. The experimental results indicate that this combination of IBN-M and FCLBN learns more accurate networks from extremely large datasets.
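
The bootstrap feature-confidence idea described for FCLBN can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the abstract does not say which structure learner or feature type FCLBN uses, so the sketch assumes undirected edges as the features and substitutes a hypothetical stand-in learner (`learn_edges`, a pairwise mutual-information threshold) for a real Bayesian network search. All function names, thresholds, and the toy data are assumptions made for illustration.

```python
import math
import random
from collections import Counter
from itertools import combinations


def mutual_information(rows, i, j):
    """Empirical mutual information (in nats) between columns i and j."""
    n = len(rows)
    joint = Counter((r[i], r[j]) for r in rows)
    count_i = Counter(r[i] for r in rows)
    count_j = Counter(r[j] for r in rows)
    mi = 0.0
    for (a, b), c in joint.items():
        # p(a,b) * log( p(a,b) / (p(a) * p(b)) ), written with raw counts
        mi += (c / n) * math.log(c * n / (count_i[a] * count_j[b]))
    return mi


def learn_edges(rows, threshold=0.05):
    """Hypothetical stand-in for a structure learner (an assumption, not
    the thesis's search): keep variable pairs whose MI exceeds a threshold."""
    n_vars = len(rows[0])
    return {
        (i, j)
        for i, j in combinations(range(n_vars), 2)
        if mutual_information(rows, i, j) > threshold
    }


def bootstrap_edge_confidence(rows, n_boot=200, seed=0, learner=learn_edges):
    """Confidence of an edge = fraction of bootstrap replicates whose
    learned structure contains that edge."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_boot):
        # Resample with replacement, same size as the original dataset.
        sample = [rng.choice(rows) for _ in rows]
        counts.update(learner(sample))
    return {edge: c / n_boot for edge, c in counts.items()}


def high_confidence_edges(confidence, min_conf=0.8):
    """Edges that could be used to constrain or guide a later search."""
    return {edge for edge, c in confidence.items() if c >= min_conf}


# Toy data (assumed): column 1 copies column 0, column 2 is unrelated.
rows = [(0, 0, 0), (0, 0, 1), (1, 1, 0), (1, 1, 1)] * 5
confidence = bootstrap_edge_confidence(rows)
strong = high_confidence_edges(confidence)
```

In this sketch, `strong` is the set of edges that appear in at least 80% of the bootstrap replicates. A learner in the spirit of FCLBN would then use such features to guide its search on the original dataset; how that guidance is actually applied is not specified in the abstract.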
Keywords/Search Tags: Bayesian Network, extremely large or small datasets, incremental learning, feature confidence