
Study And Implementation On Uncertain Query Processing Technology Using MapReduce

Posted on: 2014-03-29
Degree: Master
Type: Thesis
Country: China
Candidate: T T Wu
Full Text: PDF
GTID: 2308330473953838
Subject: Computer application technology
Abstract/Summary:
Data uncertainty is widespread in real-world applications in fields such as telecommunications, the military, finance, economics and logistics. Uncertain data is widely used in RFID networks, market analysis, moving-object tracking and environmental maintenance applications. Because these applications are important and the volume of uncertain data being collected and accumulated is growing rapidly, querying such data has become an important task that attracts increasing attention from database researchers. As data volumes grow sharply, it becomes impractical to store and process such massive data on a limited number of storage and compute servers. Consequently, handling massive uncertain data with parallel processing techniques has become a trend. In this thesis, the MapReduce parallel computing model, introduced by Google and implemented in the open-source Hadoop framework, and Bayesian network techniques for inference under uncertainty are studied in detail. Taking advantage of MapReduce's ability to process massive data sets in parallel, the thesis proposes a parallel processing framework for exact inference in Bayesian-network-based uncertainty reasoning. The algorithm parallelizes the variable elimination algorithm for Bayesian networks using the MapReduce programming model, in order to marginalize the joint probability of nodes in the network. The thesis also optimizes the parallel process of uncertainty reasoning based on Bayesian networks.
Experimental results indicate that the algorithm makes full use of the parallel computing capability of the cluster, greatly improves efficiency when handling massive amounts of data, effectively reduces computational cost and time, and improves the efficiency of queries over uncertain data. Finally, the thesis concludes with a summary and suggestions for future work.
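
The abstract does not give the algorithm itself. As a rough illustration only, the following minimal sketch shows how a single variable-elimination step (summing out one variable) could be phrased in map/shuffle/reduce form. It is not the thesis's implementation: the two-node network A -> B, its probability tables and all function names are hypothetical, and the phases are simulated in plain Python rather than run on Hadoop.

```python
from collections import defaultdict
from itertools import product

# Hypothetical two-node network A -> B with binary variables (illustrative values).
P_A = {0: 0.6, 1: 0.4}
# Conditional table P(B | A), keyed by (a, b).
P_B_GIVEN_A = {(0, 0): 0.9, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.8}


def map_phase(record):
    """Emit (assignment of remaining variables, joint probability) for one row."""
    a, b = record
    # The key drops A, the variable being eliminated; the value is P(a) * P(b | a).
    yield (b,), P_A[a] * P_B_GIVEN_A[(a, b)]


def shuffle(pairs):
    """Group mapped values by key, as a MapReduce runtime would between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Sum out the eliminated variable for one key."""
    return key, sum(values)


def eliminate_a():
    """Compute the marginal P(B) by eliminating A in map/shuffle/reduce style."""
    records = product((0, 1), repeat=2)  # every (a, b) assignment
    mapped = [kv for record in records for kv in map_phase(record)]
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


marginal_b = eliminate_a()
print(marginal_b)
```

In this sketch the mapper multiplies the factors that mention the eliminated variable, and the reducer performs the summation, so each key (an assignment of the remaining variables) can be reduced independently. That independence is what makes the step a candidate for parallel execution.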
Keywords/Search Tags: MapReduce, Uncertain query, Hadoop, Bayesian Network