The Research On Selectivity Estimation Based On XPath Path Expression

Posted on: 2008-05-12
Degree: Master
Type: Thesis
Country: China
Candidate: H G Tang
Full Text: PDF
GTID: 2178330332481822
Subject: Computer application technology
Abstract/Summary:
With the development of new Internet technologies, Web services and data exchange applications have become widespread. Because the Web holds a large amount of heterogeneous data, a standard format for data representation and exchange is urgently needed. The eXtensible Markup Language (XML), a platform-independent and self-describing data representation standard proposed by the World Wide Web Consortium (W3C), meets this need and has become a basic format for data representation and exchange on the Internet. As the volume of XML data has grown exponentially in recent years, querying XML data quickly and accurately has become one of the most active research issues. Paths are one of the most important properties of XML data, and selectivity (cost) estimation for path expressions is an important area of XML query optimization research. The core issue is how to use path expressions, especially complex ones, to improve query optimization and query efficiency.

Based on an analysis and comparison of existing selectivity estimation methods for path expressions, and taking XML as the data model and XPath as the query language, this thesis combines the structure of XML data with the characteristics of XPath path expressions to study a selectivity estimation method based on XPath path expressions. The work includes the following:

1. Several typical selectivity estimation methods for path expressions are analyzed, organized around how each one obtains and maintains XML statistics, and their performance is compared comprehensively.
2. XPath path expressions are transformed into annotated path expressions. Based on the predicates that carry the conditions in an XPath path expression, and in combination with the XML data structure, each expression is annotated with respect to both structure and condition, and the annotation is carried out by an algorithm. The processed path expressions are more concise and semantically clearer, and are therefore convenient to store.
3. An XML statistics table is built for the XML query processor from the result sizes returned by XPath queries and the annotated path expressions corresponding to those queries, and selectivity estimates for XPath path expressions are then made from this table. Finally, experiments on the DBLP and XMark data sets compare the estimation errors of this method with those of selectivity estimation methods based on the Path Tree and the Markov Table, in order to demonstrate the feasibility of the method.
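The abstract does not spell out the annotation algorithm or the layout of the statistics table, so the following is only a minimal sketch of the two baseline summaries it names, under stated assumptions. It assumes simple child-axis paths without predicates, a toy document in place of DBLP or XMark, a Path Tree stored as a flat counter of root-to-node label paths, and a Markov Table of order 2 (counts of label sequences of length 1 and 2). All function names are hypothetical and are not taken from the thesis.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical toy document; the thesis evaluates on DBLP and XMark instead.
DOC = """
<dblp>
  <article><author/><author/><title/></article>
  <proceedings>
    <article><author/><title/></article>
  </proceedings>
</dblp>
"""


def collect_statistics(root):
    """Walk the document once and return (path_tree, markov) counters."""
    path_tree = Counter()  # full root-to-node label path -> number of nodes
    markov = Counter()     # label sequences of length 1 and 2 -> occurrences

    def visit(node, prefix):
        path = prefix + (node.tag,)
        path_tree[path] += 1
        markov[path[-1:]] += 1
        if len(path) >= 2:
            markov[path[-2:]] += 1
        for child in node:
            visit(child, path)

    visit(root, ())
    return path_tree, markov


def estimate_path_tree(path_tree, steps):
    """Exact count for a simple rooted path, read directly from the summary."""
    return path_tree.get(tuple(steps), 0)


def estimate_markov(markov, steps):
    """Order-2 Markov estimate: f(t1/t2) * product of f(ti/ti+1) / f(ti)."""
    steps = tuple(steps)
    if len(steps) == 1:
        return float(markov.get(steps, 0))
    estimate = float(markov.get(steps[:2], 0))
    for i in range(1, len(steps) - 1):
        denominator = markov.get(steps[i:i + 1], 0)
        if denominator == 0:
            return 0.0
        estimate *= markov.get(steps[i:i + 2], 0) / denominator
    return estimate


def relative_error(estimate, actual):
    """Relative estimation error; assumes actual > 0."""
    return abs(estimate - actual) / actual


root = ET.fromstring(DOC.strip())
path_tree, markov = collect_statistics(root)

query = ["dblp", "article", "author"]        # i.e. /dblp/article/author
actual = estimate_path_tree(path_tree, query)
approx = estimate_markov(markov, query)
print(actual, approx, relative_error(approx, actual))
```

In this toy document the label article occurs under two different parents, so the order-2 Markov estimate for /dblp/article/author (1.5) differs from the true count (2). This is the kind of estimation error that the experiments described above compare across methods.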
Keywords/Search Tags: Query optimization, Cost estimation, XPath path expressions, XML statistics, Path annotated