
Key Techniques on XML Query Processing Based on Partially Specified

Posted on: 2011-02-22
Degree: Master
Type: Thesis
Country: China
Candidate: Y G Li
Full Text: PDF
GTID: 2198330338991116
Subject: Computer application technology
Abstract/Summary:
XML has become a de facto standard for data representation and exchange on the Internet, and it has been accepted in many fields since it was put forward by W3C in 1998. All walks of life use XML to describe their information. As the amount of XML data grows quickly with the widespread application of XML, how to search given XML data effectively and efficiently has become a hot research issue. Existing query mechanisms can be classified into three categories according to the structural information contained in their query expressions: structured query mechanisms, keyword query mechanisms and hybrid query mechanisms. This paper focuses on query processing for the hybrid query mechanism, and the main research contents are as follows.

Firstly, we analyze and summarize the current state of XML query methods, and find that existing methods cannot efficiently handle PSTP queries containing "*" nodes. We propose a method that infers from a PSTP query the corresponding set of general structured queries. We then propose an efficient algorithm, EDPS, based on the Extended Dewey labeling scheme, which processes a general PSTP query by scanning the input elements only once. A general PSTP query refers to twig queries of the general form, PSTP queries without "*" nodes, and PSTP queries containing "*" nodes.

Secondly, we consider the schema information of the underlying documents and propose a new DTD-based optimization method that improves the efficiency of query processing by removing useless query paths, which simplifies query processing and lowers its time complexity.

Finally, we implement five algorithms: EDPS, TwigStack, TSGeneric, TJFast and pTwigStack. We test the five algorithms on various datasets, comparing the number of scanned elements, running time and scalability, to show the advantage of the EDPS algorithm.
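
The abstract does not describe the Extended Dewey scheme in detail, so the following is only a minimal sketch of plain Dewey labels, the prefix-based labeling that Dewey-style schemes build on. It illustrates why such labels are convenient for twig matching: ancestor and parent relationships can be decided from two labels alone, without revisiting the document. The tree encoding and the function names (`label_tree`, `is_ancestor`, `is_parent`) are assumptions made for illustration and are not taken from the thesis.

```python
from typing import List, Optional, Tuple

Label = Tuple[int, ...]
# Hypothetical tree encoding: (tag, [children]).
Node = Tuple[str, list]


def label_tree(node: Node, label: Label = (1,),
               out: Optional[List[Tuple[Label, str]]] = None) -> List[Tuple[Label, str]]:
    """Assign plain Dewey labels in document order.

    The root gets (1,); the i-th child of a node labeled L gets L + (i,).
    """
    if out is None:
        out = []
    tag, children = node
    out.append((label, tag))
    for i, child in enumerate(children, start=1):
        label_tree(child, label + (i,), out)
    return out


def is_ancestor(a: Label, d: Label) -> bool:
    """True if the element labeled a is a proper ancestor of the one labeled d."""
    return len(a) < len(d) and d[:len(a)] == a


def is_parent(a: Label, d: Label) -> bool:
    """True if the element labeled a is the parent of the one labeled d."""
    return len(d) == len(a) + 1 and d[:len(a)] == a


# Small illustrative document (assumed, not from the thesis).
doc: Node = ("book", [
    ("title", []),
    ("author", [("name", [])]),
    ("author", [("name", [])]),
])

labels = label_tree(doc)
for lab, tag in labels:
    print(".".join(map(str, lab)), tag)
```

With these labels, a structural test such as "is this `name` element under that `author` element" reduces to a prefix comparison, which is the property that lets label-based algorithms avoid repeated scans of the input.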
Keywords/Search Tags: XML, PSTP Query, Extended Dewey, Samepath axis, Query Processing