Research Of Uncertain Data Provenance Based On Confidence Computing

Posted on: 2013-04-20
Degree: Master
Type: Thesis
Country: China
Candidate: J Xia
GTID: 2248330362970873
Subject: Computer software and theory

Abstract/Summary:
Rapid development of the Internet has led to continuously expanding data volumes, diversified forms of data, and a broad scope of data sharing, so that an enormous, complex, and heterogeneous data environment has formed across many industries. Data in this environment may be collected originally, or obtained after being copied, converted, and moved around. Data provenance became a research topic when people began to be concerned about where data comes from and how it was derived from its creation to its current output, as well as about data quality and reliability. At the same time, uncertainty inevitably arises during data evolution, especially when the original data is itself uncertain. The provenance of uncertain data and the degree of uncertainty of its source data have therefore become a critical research topic. However, in the database field, traditional techniques focus mainly on deterministic data. This thesis studies the provenance of uncertain data, including tracking provenance and calculating the uncertainty value of data. The main research contributions of this thesis are summarized as follows:

Firstly, related data provenance techniques in the database field are comprehensively surveyed; the characteristics of several representative provenance models are analyzed and compared, and the differences and relationships among them are pointed out.

Secondly, focusing on the particularity of uncertain data, the thesis shows that Why-provenance and How-provenance can both form a kind of minimal witness basis, which can be used to track the provenance of uncertain data and evaluate its degree of uncertainty. A witness basis forming algorithm and a confidence calculation algorithm are then proposed. Related experiments on the Trio system show the effectiveness of the proposed algorithms.

Finally, an uncertain data provenance tracking system is designed. Through the design of a provenance storage model, the propagation rules of provenance are summarized, traditional relational algebra and the SQL language are extended, and an implementation of provenance calculation and confidence calculation for uncertain data is given.
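
To make the idea of a minimal witness basis and a confidence value concrete, the following is a minimal Python sketch. It is not the thesis's algorithm or Trio's implementation. It assumes that each witness is a set of base tuple identifiers, that base tuples are independent, and that each base tuple carries a known probability. The function names `minimal_witness_basis` and `confidence`, the tuple identifiers, and the probabilities are all hypothetical.

```python
from itertools import combinations


def minimal_witness_basis(witnesses):
    """Keep only the witnesses that have no proper subset among the witnesses."""
    sets = {frozenset(w) for w in witnesses}
    return {w for w in sets if not any(other < w for other in sets)}


def confidence(basis, prob):
    """Probability that at least one witness is fully present.

    Uses inclusion-exclusion over the witnesses and ASSUMES that the base
    tuples are independent, so the probability of a set of tuples is the
    product of the individual probabilities.
    """
    basis = list(basis)
    total = 0.0
    for k in range(1, len(basis) + 1):
        sign = 1.0 if k % 2 == 1 else -1.0
        for group in combinations(basis, k):
            # All witnesses in the group hold iff every tuple in their union holds.
            union = frozenset().union(*group)
            term = 1.0
            for t in union:
                term *= prob[t]
            total += sign * term
    return total


# Hypothetical example: three witnesses for one output tuple.
witnesses = [{"t1", "t2"}, {"t1"}, {"t3", "t4"}]
prob = {"t1": 0.5, "t2": 0.9, "t3": 0.8, "t4": 0.5}

basis = minimal_witness_basis(witnesses)  # {t1} subsumes {t1, t2}
conf = confidence(basis, prob)
print(sorted(sorted(w) for w in basis), round(conf, 6))
```

In this example the witness {t1, t2} is dropped because {t1} alone already derives the output, which leaves the basis {t1}, {t3, t4}. Under the independence assumption the confidence is 0.5 + 0.4 - 0.2 = 0.7. Inclusion-exclusion is exponential in the number of witnesses, so this sketch only suits small bases.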
Keywords/Search Tags: data provenance, uncertain data, provenance semiring, minimal witness basis, confidence calculation