
Study On Redundancy Removal Of XPath Query Set In The Network XML Database

Posted on: 2009-07-15
Degree: Master
Type: Thesis
Country: China
Candidate: Y J Xu
Full Text: PDF
GTID: 2178360245995530
Subject: Computer software and theory
Abstract/Summary:
XML data is self-describing, supports user-defined markup, and meets the needs of data description and storage on the Internet, so XML has gradually become the de facto standard for describing and exchanging data on the Internet. As data stored in XML format grows rapidly in size and complexity, it has attracted the attention of researchers in both the Internet and database fields. Internet applications increasingly need to query, locate, and access XML data, which in turn calls for reasonable storage and fast querying of that data.

With XML becoming the standard for information exchange and description, XML database applications have become widespread. Querying dominates these applications, and many kinds of XML database query applications run in network environments. In the usual network query model, the client submits an XML query to a remote XML database server; the server evaluates the query and returns the results to the client over the network.

Several optimization techniques already exist for querying XML databases in a network environment, such as query rewriting and semantic caching. These techniques aim either to shorten query response time or to reduce network traffic. Unlike existing work, this thesis optimizes XML database queries in a network environment from a new perspective: it reduces network traffic by removing redundancy from an XPath query set, that is, by removing the redundancy among the result sets of two or more related XPath queries.

The thesis first introduces XML, the XPath query tree model, network applications of XML database querying, and related concepts. It then describes in detail several optimization techniques used in the current model of networked XML database querying, such as semantic caching and query rewriting using materialized views, and compares them with the redundancy removal technique proposed here, pointing out their similarities and differences and describing what is new in this approach.

The thesis then presents an XPath set redundancy removal system and its framework and explains the function of each module in the framework. The core of the framework, and the central topic of the thesis, is the redundancy removal algorithm for XPath sets. The algorithm is described for two kinds of XPath sets: simple XPath sets and complex XPath sets with predicates. It is first explained through examples, and the relevant conclusions are then proved. For complex XPath sets with predicates, the algorithm uses XPath tree patterns for optimization.

The original redundancy removal solution is improved by adding a query relevance estimation module and a DTD estimation module. The latter uses the DTD tree to assess redundancy across different XML document structures and to balance network traffic against XPath query complexity, so that the algorithm better meets users' needs. Finally, experiments verify the conclusions about the algorithm, and an analysis of the test results shows the benefits of the algorithm's optimizations and extensions.

This work is significant for the current network model of XML databases, in which network traffic still plays an important role, especially for XML database applications that are billed on the basis of network traffic.
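
To make the idea of redundancy among related XPath result sets concrete, here is a minimal Python sketch. It is not the thesis's algorithm, which removes redundancy at the level of the XPath set itself using tree patterns and a DTD tree. It only illustrates the effect on the result side, assuming the limited XPath subset supported by the standard library's `xml.etree.ElementTree`. The function name `remove_redundancy` and the sample document are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, used only for illustration.
DOC = """
<lib>
  <book year="2008"><title>A</title></book>
  <book><title>B</title></book>
  <journal><title>C</title></journal>
</lib>
"""

def remove_redundancy(root, xpaths):
    """For each XPath in order, keep only the result nodes that are not
    already contained in a subtree returned for an earlier query."""
    sent = set()   # ids of all elements covered by subtrees kept so far
    plan = {}
    for xp in xpaths:
        fresh = []
        for node in root.findall(xp):
            if id(node) in sent:
                continue  # already covered by an earlier result subtree
            fresh.append(node)
            # Mark the node and all of its descendants as covered.
            sent.update(id(d) for d in node.iter())
        plan[xp] = fresh
    return plan

root = ET.fromstring(DOC)
plan = remove_redundancy(root, [".//book", ".//title", ".//book[@year]"])
```

In this sketch the two `title` elements nested inside `book` elements are not returned again for `.//title`, and `.//book[@year]` contributes nothing new, so only the non-overlapping nodes would need to cross the network. The result depends on query order, which is a simplification of this sketch rather than a property claimed by the thesis.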
Keywords/Search Tags: XPath, Query Set, Redundancy Removal, XPath Pattern, DTD Tree