
Research On Query Process Technology For Continuous Uncertain XML

Posted on: 2013-06-06
Degree: Master
Type: Thesis
Country: China
Candidate: Z Z Zheng
Full Text: PDF
GTID: 2248330392954335
Subject: Computer application technology
Abstract/Summary:
The limitations of human cognition, differences between information descriptions, measurement errors, and even the dynamic changes of data commonly generate uncertain data. In recent years, with the development of data acquisition and processing technology, uncertain data has received extensive attention. In many practical applications, such as economics, the military, logistics, finance, and telecommunications, data uncertainty is common and uncertain data plays a key role. The structural characteristics of traditional relational models are not well suited to storing and processing uncertain data. Compared with probabilistic relational models, XML's advantages, such as its semi-structured nature, self-description, and higher scalability, make it more suitable for expressing and storing uncertain data. At present, techniques for managing uncertain XML data are mostly based on discrete uncertain XML. In many situations, however, the uncertainty associated with the data is distributed continuously, and the data can be represented in terms of a continuous probability distribution. How to manage continuous uncertain XML data has become a focus of research.

Currently, most query processing for continuous uncertain XML operates on data that has been discretized. This is not a very efficient approach, because the query operators have to process a large number of histogram segments during query execution. This thesis proposes a query processing technique for continuous uncertain XML based on the p-document model. First, the p-document model is extended with a cont node to support any continuous distribution; the probability density functions and their parameters are encoded in the cont node. Second, the paths that meet the user's requirements are found by twig pattern matching. Finally, the continuous distributions in the leaf nodes are processed: depending on the type of continuous distribution being queried, a probability query is executed using the symbolic form, integrals, or histograms. For standard continuous distributions, the parameters of the symbolic representation are used in conjunction with some sophisticated functions to compute the query answer; non-standard continuous distributions that meet the integral condition use the integral method; all other distributions use histogram approximation.

Extensive experiments were carried out on the query processing strategy designed in the thesis, which uses different probability feature calculation methods according to the type of continuous distribution. The experimental results show that this strategy can effectively process standard continuous distributions, and non-standard continuous distributions that meet the integral condition, in uncertain XML, and that it achieves better accuracy and response time than the existing approach.
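The three-way choice described in the abstract (symbolic form, integral, or histogram) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the thesis's implementation: the `ContNode` class, the `cont_from_xml` helper, the attribute names `type`, `mu`, `sigma`, `a`, `b`, and the use of Simpson's rule for the integral branch are all hypothetical. The "integral condition" is approximated here by simply checking whether a density function is available.

```python
import math
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class ContNode:
    """Hypothetical stand-in for a cont leaf: distribution type plus parameters."""
    kind: str
    params: dict = field(default_factory=dict)
    pdf: Optional[Callable[[float], float]] = None  # density, if integrable
    bins: Optional[list] = None                     # (low, high, mass) triples


def cont_from_xml(elem):
    """Read a cont element; the attribute layout is an assumption."""
    attrs = dict(elem.attrib)
    kind = attrs.pop("type")
    return ContNode(kind=kind, params={k: float(v) for k, v in attrs.items()})


def _normal_cdf(x, mu, sigma):
    # Closed-form normal CDF expressed through the error function.
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def _simpson(f, a, b, n=200):
    # Composite Simpson's rule with an even number of subintervals.
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


def _histogram_mass(bins, lo, hi):
    # Sum each bin's mass scaled by the fraction of the bin inside [lo, hi].
    mass = 0.0
    for b_lo, b_hi, b_mass in bins:
        overlap = min(hi, b_hi) - max(lo, b_lo)
        if overlap > 0:
            mass += b_mass * overlap / (b_hi - b_lo)
    return mass


def range_probability(node, lo, hi):
    """Estimate P(lo <= X <= hi), picking the method from the node's type."""
    if hi <= lo:
        return 0.0
    if node.kind == "normal":       # standard distribution: symbolic form
        mu, sigma = node.params["mu"], node.params["sigma"]
        return _normal_cdf(hi, mu, sigma) - _normal_cdf(lo, mu, sigma)
    if node.kind == "uniform":      # standard distribution: symbolic form
        a, b = node.params["a"], node.params["b"]
        return max(0.0, min(hi, b) - max(lo, a)) / (b - a)
    if node.pdf is not None:        # non-standard but integrable
        return _simpson(node.pdf, lo, hi)
    if node.bins is not None:       # fallback: histogram approximation
        return _histogram_mass(node.bins, lo, hi)
    raise ValueError("no usable representation for this cont node")
```

For example, `range_probability(cont_from_xml(ET.fromstring('<cont type="normal" mu="20" sigma="2"/>')), 18, 22)` evaluates the normal CDF directly and never touches histogram segments, which is the efficiency argument the abstract makes for the symbolic branch.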
Keywords/Search Tags: P-document Model, Uncertain XML, Continuous Distribution, Query Process