
Research On Key Technologies Of Fuzzy XML Data Storages And Queries

Posted on: 2015-06-16
Degree: Doctor
Type: Dissertation
Country: China
Candidate: J Liu
GTID: 1108330482454591
Subject: Computer application technology
Abstract/Summary:
With the wide application of Web technologies, the Internet has become an important tool for information communication. Because data on the Internet is heterogeneous, more and more application systems choose XML (Extensible Markup Language) as the de facto standard for information representation and exchange. As the next-generation Web language, XML offers advantages such as extensibility and platform independence. As long as applications support XML, they can exchange information seamlessly. XML plays a significant role in the Web environment and has become the cornerstone of intelligent management systems. Against this background, effective XML data management has received significant attention from both academia and industry, and the theories and techniques of XML data storage and querying have become hot topics among database researchers.

Information imprecision and uncertainty exist in many real-world applications, which has stimulated considerable interest in fuzzy data management. Combining databases with fuzzy data management techniques has become a research focus, and it also creates a new set of uncertain data management requirements involving XML. Although some research has investigated uncertain XML data management in recent years, research on fuzzy XML is still in its infancy. Relatively little work has been carried out on storing and querying XML data that represents imprecise and uncertain concepts. In view of this situation, this dissertation proposes an effective and efficient solution, together with related techniques for fuzzy data storage and querying, for constructing a robust XML data management system.

The main contributions of this dissertation are as follows:

Firstly, to address fuzzy XML data storage, the dissertation investigates storing fuzzy XML data in relational databases and reengineering fuzzy XML into the UML data model. In particular, an edge-based mapping approach is proposed to shred fuzzy XML data into relational data, and a query transformation approach is proposed to translate XML query expressions into SQL expressions. These approaches avoid losing semantic order information during the reengineering from the fuzzy XML data model to the relational data model. Moreover, a rule-based mapping approach is developed to reengineer fuzzy XML into the UML data model. This approach does not require systems to provide XML schemas during the mapping from the fuzzy XML data model to the UML data model.

Secondly, based on a fuzzy labeling scheme, the dissertation investigates twig pattern queries with complex predicates. Using a holistic matching strategy, efficient approaches are proposed for answering twig pattern queries with complex predicates, such as OR or NOT connectives, in homogeneous fuzzy XML documents. The proposed methods avoid the re-scan problem and improve query performance when producing matches. A holistic approach to twig matching in heterogeneous fuzzy XML documents is also discussed. This method avoids the significant cost of integrating heterogeneous documents beforehand, and it improves query performance when producing matches in heterogeneous fuzzy XML documents.

Finally, to address the problem of few or empty answers being returned for structured queries (twig pattern queries), an adaptive approximate query approach based on semantic similarities is proposed. A weight assignment method is proposed that analyzes the data distribution and infers the factors users care about most. Evaluation methods for structure similarity and content similarity are then proposed. On this basis, an adaptive query relaxation approach and a ranking method are presented. The proposed approach does not require users to understand the XML schemas or to provide rigid query expressions, which effectively reduces the query cost and improves precision and recall.
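
As a rough illustration of the storage idea in the first contribution, the following sketch shreds a small fuzzy XML fragment into an edge table and answers a path query with SQL. It is not the dissertation's actual mapping: the `edge` table layout, the `ordinal` column used to keep sibling order, the simplified `Poss` attribute carrying a possibility degree (defaulting to 1.0), and the helper names `shred`, `load`, and `ages_above` are all assumptions made for this example.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical fuzzy XML fragment: `Poss` is an assumed attribute holding
# the possibility degree of a value; elements without it default to 1.0.
DOC = (
    "<dept><employee><name>Ann</name>"
    '<age Poss="0.8">30</age><age Poss="0.6">31</age>'
    "</employee></dept>"
)


def shred(xml_text):
    """Return edge rows (node_id, parent_id, ordinal, tag, value, poss)
    in document order; `ordinal` records the position among siblings."""
    rows = []

    def visit(elem, parent_id, ordinal):
        node_id = len(rows)  # ids are assigned in pre-order
        value = (elem.text or "").strip() or None
        poss = float(elem.get("Poss", "1.0"))
        rows.append((node_id, parent_id, ordinal, elem.tag, value, poss))
        for position, child in enumerate(elem, start=1):
            visit(child, node_id, position)

    visit(ET.fromstring(xml_text), None, 1)
    return rows


def load(rows):
    """Store the edge rows in an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE edge (node_id INTEGER PRIMARY KEY, parent_id INTEGER,"
        " ordinal INTEGER, tag TEXT, value TEXT, poss REAL)"
    )
    conn.executemany("INSERT INTO edge VALUES (?, ?, ?, ?, ?, ?)", rows)
    return conn


def ages_above(conn, threshold):
    """SQL counterpart of the path employee/age, keeping only values whose
    possibility degree reaches the threshold, in sibling order."""
    sql = (
        "SELECT c.value, c.poss FROM edge AS p"
        " JOIN edge AS c ON c.parent_id = p.node_id"
        " WHERE p.tag = 'employee' AND c.tag = 'age' AND c.poss >= ?"
        " ORDER BY c.ordinal"
    )
    return conn.execute(sql, (threshold,)).fetchall()


if __name__ == "__main__":
    connection = load(shred(DOC))
    print(ages_above(connection, 0.7))
```

Each parent-child step of a path query becomes one self-join on the edge table, and the stored ordinal is what lets the relational form keep the sibling order of the original document.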
Keywords/Search Tags: fuzzy XML data, data storage, UML data model, twig pattern queries, holistic matching, semantic similarities, approximate query