
Research On Key Techniques Of Path Expression Query Processing For XML

Posted on: 2004-10-23
Degree: Doctor
Type: Dissertation
Country: China
Candidate: J Wang
Full Text: PDF
GTID: 1118360185495655
Subject: Computer application technology
Abstract/Summary:
With the advent of the Internet, networks play an important role in people's lives and will be applied in more and more areas. As a representative of new Web technologies, XML has become a focus of both the research and industrial communities. Traditional database technologies cannot work efficiently because of the tree-like nature of XML data and the new application environment, so technologies designed specifically for XML data are needed to process it efficiently.

Querying is one of the most important issues in data processing. This dissertation focuses on path expression processing, so that the key issues in large-scale XML query applications can be settled by feasible approaches. We propose a framework for processing path expressions in the native XML data management system Orient-X, focusing on three key topics: the structural join algorithm, the path index, and path query decomposition.

We propose a range-partition-based algorithm to implement the structural join. The structural join is the core operation of XML query processing, so implementing it efficiently is key to improving XML query processing. Built on the region numbering scheme for XML data, the algorithm's main idea is partitioning: it divides the input element sets into several subsets according to their coding regions and performs join operations between corresponding subsets. Unlike previous algorithms, it does not assume that the input element sets are sorted or indexed. We give the partition rule and describe the concrete algorithms. Comprehensive experiments show that the algorithm performs well on both synthetic and real-world datasets and scales well.

We propose a new schema-based path index, the Schema gUided Path indEx for XML data (SUPEX). Although a schema is not mandatory in the XML standard, DTDs often exist in practical applications. SUPEX derives its index structure from the schema information in a DTD and summarizes all paths that can possibly appear in XML data conforming to that DTD. We describe the architecture of SUPEX in detail and give the procedure for constructing the index structure. SUPEX supports several kinds of queries, including absolute path expressions, relative path expressions, and basic structural relationships. Because index records carry coding information, the results of an index query can be used in subsequent query processing.

We propose a framework for evaluating path expressions that is guided by the target node of the query pattern. We define the minimal simple path decomposition of a query pattern based on basic operations and index queries. The procedure that transforms a query decomposition state into a query plan tries to ensure that the results of an operation can be used in...
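
The partition idea behind the structural join described above can be illustrated with a minimal Python sketch. It is not the dissertation's algorithm: the abstract does not state the partition rule, so the sketch assumes equal-width partitions of the code range, places each descendant in the partition of its start position, and copies each ancestor into every partition its region overlaps. The names `Region`, `contains`, `partition_join`, and `num_parts` are hypothetical, as is the `(start, end)` region encoding, in which an ancestor's interval strictly encloses each descendant's interval. Neither input list needs to be sorted or indexed.

```python
from collections import defaultdict
from typing import NamedTuple


class Region(NamedTuple):
    """An element with an assumed (start, end) region code."""
    tag: str
    start: int
    end: int


def contains(a: Region, d: Region) -> bool:
    # Ancestor-descendant test under region numbering:
    # the ancestor's interval strictly encloses the descendant's.
    return a.start < d.start and d.end < a.end


def partition_join(ancestors, descendants, num_parts=4):
    """Return all (ancestor, descendant) pairs; inputs may be unsorted."""
    ancestors, descendants = list(ancestors), list(descendants)
    if not ancestors or not descendants:
        return []

    # Assumed partition rule: split the overall code range into
    # num_parts equal-width slices.
    everything = ancestors + descendants
    lo = min(r.start for r in everything)
    hi = max(r.end for r in everything)
    span = hi - lo + 1
    width = max(1, -(-span // num_parts))  # ceiling division

    def part(pos: int) -> int:
        return (pos - lo) // width

    # Each descendant goes to exactly one partition (by start position),
    # so no result pair can be produced twice.
    d_parts = defaultdict(list)
    for d in descendants:
        d_parts[part(d.start)].append(d)

    # Each ancestor is copied into every partition its region overlaps.
    a_parts = defaultdict(list)
    for a in ancestors:
        for p in range(part(a.start), part(a.end) + 1):
            a_parts[p].append(a)

    # Join only corresponding subsets.
    result = []
    for p, ds in d_parts.items():
        for a in a_parts.get(p, []):
            for d in ds:
                if contains(a, d):
                    result.append((a, d))
    return result


# Region codes for <a><b><c/></b><b><c/></b></a>, given in arbitrary order.
ancestors = [Region("b", 6, 9), Region("b", 2, 5)]
descendants = [Region("c", 7, 8), Region("c", 3, 4)]
pairs = sorted(partition_join(ancestors, descendants))
```

Copying ancestors across partitions trades some extra memory for independence between partitions: each pair of corresponding subsets can be joined separately, which is the property a partition-based join relies on.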
Keywords/Search Tags:XML, Path Expression, Numbering Scheme, Structural Join, Path Index