
Functional Dependencies For XML

Posted on: 2012-08-10
Degree: Master
Type: Thesis
Country: China
Candidate: T Y Li
Full Text: PDF
GTID: 2178330335950951
Subject: Computer software and theory
Abstract/Summary:
In recent years, with the rapid progress of technology and society, computer technology has reached into all aspects of human study, work, and entertainment. At the same time, as XML becomes increasingly popular, XML schema design has become an increasingly important issue. One of the central objectives of good schema design is to avoid data redundancies: redundantly stored information leads not only to higher data storage costs but also to increased costs for data transfer and data manipulation. Furthermore, such redundancies can cause update anomalies that render the database inconsistent. One strategy for avoiding data redundancies is to design a redundancy-free schema from the start on the basis of known functional dependencies. We observe, however, that XML databases are often "casually designed" and that XML FDs may not be determined in advance. Under such circumstances, discovering XML data redundancies from the data itself becomes necessary and is an integral part of the schema refinement process. We present the design and implementation of the first system, Discover XML FDs, for efficient discovery of XML data redundancies. It employs a novel XML data structure and introduces a new class of partition-based algorithms. The XML data redundancies are defined on the basis of a new notion of XML functional dependency (XML FD) that (1) extends previous notions by incorporating set elements into the XML FD specification, and (2) maintains tuple-based semantics through the novel concept of the Generalized Tree Tuple. Using this comprehensive XML FD notion, we introduce a new normal form (GTT-XNF) for XML documents and provide comprehensive comparisons with previous studies. Given the set of data redundancies (in the form of redundancy-indicating XML FDs) discovered by Discover XML FDs, we describe a normalization algorithm for converting any original XML schema into one in GTT-XNF.
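The idea behind a partition-based check can be illustrated with a minimal Python sketch. It is not the thesis's algorithm: it assumes the XML data has already been flattened into plain dictionaries (the thesis instead works on generalized tree tuples), and the names `partition`, `fd_holds`, and the sample `rows` are hypothetical. The sketch relies on the standard observation that an FD X -> A holds exactly when grouping by X and grouping by X plus A yield the same number of groups.

```python
from collections import defaultdict


def partition(rows, attrs):
    """Group row indices into equivalence classes by their values on attrs."""
    groups = defaultdict(set)
    for index, row in enumerate(rows):
        key = tuple(row.get(a) for a in attrs)
        groups[key].add(index)
    return [frozenset(g) for g in groups.values()]


def fd_holds(rows, lhs, rhs):
    """Return True if the FD lhs -> rhs holds on rows.

    Adding rhs to the grouping attributes can only split classes,
    so equal class counts mean no class was split, i.e. rhs is
    constant within every lhs class.
    """
    return len(partition(rows, lhs)) == len(partition(rows, lhs + [rhs]))


# Hypothetical flattened data: one dictionary per (book, store) pair.
rows = [
    {"isbn": "1", "title": "XML Basics", "store": "A"},
    {"isbn": "1", "title": "XML Basics", "store": "B"},
    {"isbn": "2", "title": "Databases", "store": "A"},
]

print(fd_holds(rows, ["isbn"], "title"))
print(fd_holds(rows, ["store"], "title"))
```

Here `isbn -> title` holds on the sample rows, while `store -> title` does not, because store A carries two different titles.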
XML data redundancies have richer semantics than redundancies in the relational context. We proposed generalized tree tuple-based XML FD and Key notions that improve upon previous proposals and capture a comprehensive set of XML data redundancies, including in particular redundancies involving set elements. Based on these new notions, we proposed a new XML normal form. We also defined fuzzy XML functional dependencies. We designed and implemented Discover XML FDs, the first system that detects XML data redundancy through the discovery of XML FDs and Keys. We further designed a normalization algorithm that converts any XML schema into one in GTT-XNF, given the set of detected redundancy-indicating XML FDs. Experimental evaluation demonstrates that the system is practical in detecting redundancies in real datasets and scales well with increasing dataset size. In future work, the key tasks are to find the XML FDs that relate to the relational form and to establish the fuzzy XML FDs more firmly, with the aim of eliminating redundancy in XML data as early as possible.
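As a rough illustration of what "redundancy-indicating" can mean, the following Python sketch counts redundantly stored values on flattened data. It assumes the usual relational reading, namely that an FD indicates redundancy when it holds but its left-hand side is not a key, so the same right-hand value is stored more than once. The function `redundant_values` and the sample `rows` are hypothetical and do not reproduce the thesis's tree-tuple-based definitions.

```python
from collections import defaultdict


def redundant_values(rows, lhs, rhs):
    """Count redundantly stored rhs values under the FD lhs -> rhs.

    Returns None if the FD is violated on rows. Otherwise, every row
    beyond the first in each lhs group repeats an rhs value that is
    already determined, so it counts as one redundant copy.
    """
    groups = defaultdict(list)
    for row in rows:
        groups[tuple(row[a] for a in lhs)].append(row)

    redundant = 0
    for group in groups.values():
        if len({row[rhs] for row in group}) > 1:
            return None  # two rows agree on lhs but differ on rhs
        redundant += len(group) - 1
    return redundant


# Hypothetical flattened data: one dictionary per (book, store) pair.
rows = [
    {"isbn": "1", "title": "XML Basics", "store": "A"},
    {"isbn": "1", "title": "XML Basics", "store": "B"},
    {"isbn": "2", "title": "Databases", "store": "A"},
]

print(redundant_values(rows, ["isbn"], "title"))
print(redundant_values(rows, ["isbn", "store"], "title"))
print(redundant_values(rows, ["store"], "title"))
```

On the sample rows, `isbn -> title` holds with one repeated title, so it indicates redundancy. The pair `isbn, store` acts as a key and yields zero repeats, and `store -> title` is violated, so the function returns `None`.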
Keywords/Search Tags: XML, Functional dependency, Data redundancy, Schema design, Fuzzy XML functional dependencies, Frequent subtrees