A Study Of Probabilistic Data Model Based On XML

Posted on: 2009-02-23
Degree: Master
Type: Thesis
Country: China
Candidate: Q Zhang
Full Text: PDF
GTID: 2178360245995997
Subject: Computer software and theory
Abstract/Summary:
The limitations of human cognition, differences between information descriptions, measurement errors, and even the dynamic changes of data commonly give rise to uncertain data. As studies of uncertainty have deepened, the uncertain character of the future world has become more widely recognized in modern academia. However, the classic relational model lacks support for managing uncertain data, so probabilistic data models have gradually received widespread attention.

Probabilistic relational models emerged first. They are structured and use flat storage, but they are not well suited to storing and processing probabilistic data.

As network technology has developed vigorously, the variety and uncertainty of data have increased as well. Such data often differ in structure, source and cause, so the data sources are uneven and differ greatly in scale, credibility and availability. A storage approach more suitable than the structured one is therefore needed. Moreover, with its emergence and rapid development, XML has been widely used on the Web for data representation and exchange. Compared with the probabilistic relational models, XML's advantages, such as being semi-structured, self-describing and more scalable, make it better suited to representing and storing probabilistic data.

Probabilistic data models based on XML have become a hot research topic. Current research on XML-based probabilistic data models focuses on queries within a single source. Given the complexity introduced by differing scales, reliabilities, effective times and query counts, the management of multi-source probabilistic data should not be neglected. This paper begins the exploration of queries under multiple data sources. Building on existing model concepts, it proposes an extended probabilistic data model based on XML. The new model makes full use of the information carried by the data source itself, such as reliability, scale and so forth.
This improvement removes the limitation of confining queries to a single source. Specifically, merging and querying of multi-source data are supported, which makes more data available.

The main work of this paper is as follows:
(1) The paper discusses several major ways in which uncertain data arise and reviews research on probabilistic data models. It also summarizes the characteristics and flaws of the probabilistic relational models and the XML-based data models.
(2) The paper proposes an extended probabilistic data model based on XML. It gives a formal definition of the new model, the DTD description it supports, and the query and merge algorithms. It also solves the data-dependence problem caused by XML-based probabilistic representation.
(3) The paper proves that the model guarantees closure, compatibility and generalization when merging and querying under multi-source circumstances.
(4) Besides the theoretical methodology, the paper demonstrates an implementation. It describes the architecture of the system and analyzes its performance through experiments.
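To make the idea of merging multi-source probabilistic XML concrete, the following is a minimal sketch and not the thesis's actual model or algorithm, which the abstract does not specify. It assumes a hypothetical encoding in which a `prob` attribute gives the probability that an element exists, and it assumes a simple reliability-weighted average as the merge rule. The sample documents, the function names `value_distribution` and `merge`, and the reliability values are all illustrative.

```python
import xml.etree.ElementTree as ET

# Hypothetical encoding (an assumption, not taken from the thesis): a "prob"
# attribute gives the probability that an element exists; default is 1.0.
SOURCE_A = """
<person>
  <name>Alice</name>
  <city prob="0.7">Jinan</city>
  <city prob="0.3">Qingdao</city>
</person>
"""

SOURCE_B = """
<person>
  <name>Alice</name>
  <city prob="0.4">Jinan</city>
  <city prob="0.6">Qingdao</city>
</person>
"""


def value_distribution(xml_text, tag):
    """Return {text: probability} for the root's direct children named `tag`."""
    root = ET.fromstring(xml_text)
    return {
        child.text: float(child.get("prob", "1.0"))
        for child in root.findall(tag)
    }


def merge(dist_a, rel_a, dist_b, rel_b):
    """Combine two distributions by a reliability-weighted average.

    This merge rule is an illustrative assumption; a value missing from one
    source is treated as having probability 0.0 in that source.
    """
    total = rel_a + rel_b
    values = set(dist_a) | set(dist_b)
    return {
        v: (rel_a * dist_a.get(v, 0.0) + rel_b * dist_b.get(v, 0.0)) / total
        for v in values
    }


# Source A is assumed to be three times as reliable as source B.
merged = merge(
    value_distribution(SOURCE_A, "city"), 0.9,
    value_distribution(SOURCE_B, "city"), 0.3,
)
```

With the assumed reliabilities 0.9 and 0.3, the merged distribution assigns 0.625 to Jinan and 0.375 to Qingdao, so the more reliable source dominates while the result still sums to 1. A model that also accounts for scale, effective times or data dependence, as the abstract describes, would need a richer rule than this average.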
Keywords/Search Tags: Probabilistic, XML, Merge, Data Dependence, Data Model