The Research On XML Database Schema Normalization Based On Constraints

Posted on: 2005-07-20
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Z P Zhang
Full Text: PDF
GTID: 1118360125467587
Subject: Computer software and theory
Abstract/Summary:
As XML becomes the standard for data representation and exchange on the Web, the volume of XML data exchanged and processed on the Web is growing rapidly, which places greater demands on XML database schemas. As with relational databases, a poorly designed XML schema causes insertion, deletion and update anomalies. Because the Web is open, the harm caused by anomalies in XML data is even greater than for relational data. Related XML research has already produced results, for example techniques for storing, publishing, querying and optimizing XML data, and the transformation between XML and relational data is particularly mature. However, XML data has become the mainstream data on the Internet. If we consider only how to transform XML data into relational data, which preserves only structural information, and do not consider the XML database schema from the standpoint of database design, Web data processing will later run into considerable trouble and much redundant and inconsistent data will result. This dissertation takes the database design standpoint: it studies XML database constraints in depth and normalizes Web data directly to obtain a good XML database schema. The approach preserves both the semantic and the structural information of XML documents, satisfying the requirements of database design, and completes the XML database design in one pass, which avoids the repeated design required by existing methods and reduces data redundancy so that Web data stays consistent. Research on XML database schema normalization therefore has both theoretical significance and practical value. XML data normalization is investigated using path expressions and the tree tuple representation, based on the existing DTD and XML-Schema specifications.
The contributions of this dissertation are as follows:
(1) XML functional dependency, partial functional dependency and transitive functional dependency are defined in terms of path expressions and tree tuples, and XML functional dependency constraints are studied further. We define logical implication and cover for XML functional dependencies, give a sound and complete set of inference rules for them, and present polynomial-time (PTime) algorithms for computing a normalized cover and a minimum cover.
(2) Based on the formal definition of XML functional dependency, normal forms of different levels are defined for XML, and normalization rules for XML documents are presented, namely the upgrading-element rule and the creating-element rule. A normalization algorithm based on these rules is presented, and its soundness is verified experimentally.
(3) Key constraints for XML are defined, and a sound and complete set of inference rules for absolute and relative keys is given. An algorithm for finding the candidate keys of an XML document is presented, and its validity, termination and time complexity are proved.
(4) XML multivalued dependency is defined, and a sound and complete set of inference rules for it is given. Based on the definitions of logical implication and cover, a PTime algorithm for computing a nonredundant cover of XML multivalued dependencies is presented.
(5) Methods for measuring the similarity between XML documents are given, including the set measure, the linear measure and the cost measure. An algorithm that measures the similarity between XML documents using machine learning with node costs is presented. Experiments show that it widens the scope of XML document search and improves recall and precision.
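To make the notion of an XML functional dependency in contribution (1) more concrete, the following is a minimal Python sketch and not the dissertation's own definition or algorithm. It assumes a simplified reading in which each element matched by a context path plays the role of one tree tuple, and a dependency holds when equal values on the left-hand paths imply equal values on the right-hand path. The function name `fd_holds`, the sample document and the element names (`course`, `student`, `sno`, `name`) are hypothetical and chosen only for illustration.

```python
import xml.etree.ElementTree as ET


def fd_holds(root, context, lhs, rhs):
    """Check a simplified XML functional dependency (illustrative only).

    context: ElementTree path selecting the elements treated as tuples.
    lhs:     list of paths, relative to each context element.
    rhs:     a single path, relative to each context element.
    Returns True if equal lhs values always come with equal rhs values.
    """
    seen = {}
    for node in root.findall(context):
        key = tuple(node.findtext(p) for p in lhs)
        value = node.findtext(rhs)
        # Remember the first rhs value seen for this key; any later
        # occurrence of the same key with a different value is a violation.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical document: the name of student s1 is stored twice,
# which is the kind of redundancy that normalization aims to remove.
DOC = """
<courses>
  <course cno="c1">
    <student><sno>s1</sno><name>Ann</name></student>
    <student><sno>s2</sno><name>Bob</name></student>
  </course>
  <course cno="c2">
    <student><sno>s1</sno><name>Ann</name></student>
  </course>
</courses>
"""

root = ET.fromstring(DOC)
consistent = fd_holds(root, ".//student", ["sno"], "name")

# An update that changes only one copy of the name breaks the dependency.
root.find("./course[@cno='c2']/student/name").text = "Anne"
after_update = fd_holds(root, ".//student", ["sno"], "name")
```

In this sketch `consistent` is `True` and `after_update` is `False`: because the student's name is repeated under every course, an update to a single copy leaves the document inconsistent. This is an instance of the update anomaly that the abstract attributes to poorly designed XML schemas.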
Keywords/Search Tags:constraint, XML schema, data dependency, normalization, inference rules, implication and cover