Uncertain Data Cleaning Method Based on D-S Evidence Theory

Posted on: 2015-02-13
Degree: Master
Type: Thesis
Country: China
Candidate: J H Fan
GTID: 2268330431967448
Subject: Computer software and theory
Abstract/Summary:
Uncertain data is now widespread in database applications such as commodity logistics, economics, financial management, military information, and telecommunications, and its importance is increasingly recognized. As data volumes grow, data sharing widens, and data formats diversify, more uncertainty arises, and with it a large amount of dirty data. Traditional data cleaning techniques cannot be applied directly to uncertain data, which has made uncertain data cleaning a topic of great interest.

Data cleansing detects erroneous, missing, or inconsistent data so that it can be removed, filled in, or corrected to improve data quality. Dirty data is common in uncertain data and seriously affects data management, analysis, and decision making. Studying an effective error detection method therefore has both theoretical significance and practical value, and it is also a difficult and challenging problem.

As a theory of uncertain reasoning, D-S evidence theory can both represent uncertain data and measure its uncertainty. It has therefore been widely applied in medical diagnosis, intelligence and legal case analysis, multiple-attribute decision analysis, target recognition, and many other fields.

To address the effectiveness and correctness requirements of uncertain data cleansing, this thesis draws on the strengths of D-S evidence theory in data representation and reasoning and considers SPJ (Selection-Projection-Join) operations. It proposes a method for detecting erroneous data in query results based on the evidence interval of the target data items.

The main work of this thesis can be summarized as follows:

1. Constructing a mathematical model for detecting errors in uncertain data SPJ operations based on D-S evidence theory. To construct the model from test data, we take the results of selection-projection-join queries and, using a traversal algorithm, give an algorithm that builds the frames of discernment of D-S evidence theory as the basis for error detection in uncertain data.

2. Detecting errors using the confidence interval of D-S evidence theory. Based on the frame of discernment, we propose a method that calculates the probability value of each data item using an evidence fusion algorithm and an approximation algorithm. We then use the probability values to compute the confidence interval, and finally use the confidence interval to detect errors in the uncertain data.

3. Experimental analysis. Using the results of SPJ operations on uncertain data, we test the proposed method based on D-S evidence theory. The experimental results show that the method is efficient, accurate, and applicable.
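The evidence fusion and interval steps summarized above can be illustrated with a minimal Python sketch of standard D-S evidence theory: Dempster's rule of combination, followed by the belief/plausibility interval [Bel, Pl] of a hypothesis. This is not the thesis's own algorithm, whose details (including its approximation step and how it builds frames of discernment from SPJ results) are not given in the abstract. The function names, the representation of mass functions as dictionaries keyed by `frozenset`, and the `is_suspect` threshold rule are all assumptions made for illustration.

```python
from itertools import product


def combine(m1, m2):
    """Fuse two mass functions with Dempster's rule of combination.

    Each mass function maps a focal element (a frozenset drawn from the
    frame of discernment) to its mass; the masses sum to 1.
    """
    combined = {}
    conflict = 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + wa * wb
        else:
            # Mass assigned to contradictory pairs of focal elements.
            conflict += wa * wb
    norm = 1.0 - conflict
    if norm <= 0.0:
        raise ValueError("sources are in total conflict; cannot combine")
    return {s: w / norm for s, w in combined.items()}


def belief(m, hypothesis):
    """Bel(H): total mass of focal elements contained in H."""
    return sum(w for s, w in m.items() if s <= hypothesis)


def plausibility(m, hypothesis):
    """Pl(H): total mass of focal elements that intersect H."""
    return sum(w for s, w in m.items() if s & hypothesis)


def evidence_interval(m, hypothesis):
    """Return the interval [Bel(H), Pl(H)] as a tuple."""
    return belief(m, hypothesis), plausibility(m, hypothesis)


def is_suspect(m, hypothesis, threshold=0.5):
    """Hypothetical detection rule (an assumption, not from the thesis):
    flag a value when even its plausibility falls below the threshold."""
    _, pl = evidence_interval(m, hypothesis)
    return pl < threshold


if __name__ == "__main__":
    # Two hypothetical sources reporting on one attribute whose frame of
    # discernment is {"a", "b"}.
    A, B = frozenset({"a"}), frozenset({"b"})
    AB = A | B
    source1 = {A: 0.6, AB: 0.4}
    source2 = {A: 0.7, B: 0.1, AB: 0.2}
    fused = combine(source1, source2)
    for name, h in (("a", A), ("b", B)):
        bel, pl = evidence_interval(fused, h)
        print(f"{name}: [{bel:.4f}, {pl:.4f}] suspect={is_suspect(fused, h)}")
```

In this sketch, a narrow interval near 1 indicates strong agreement between sources, while an interval whose upper bound is low marks a value that the combined evidence barely supports. The 0.5 threshold is arbitrary and would need to be tuned for any real data set.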
Keywords/Search Tags: Uncertain data, Data cleansing, Error detection, D-S evidence theory, Evidence interval