Study On Structural Index Technology And Query Optimization For XML

Posted on:2004-03-20

Degree:Master

Type:Thesis

Country:China

Candidate:S T Guo

Full Text:PDF

GTID:2168360095456771

Subject:Computer software and theory

Abstract/Summary:

Various index techniques and join algorithms [12,13,14,15,16,23,24] have recently been proposed to optimize XML queries. The indices are built on tags and element values. However, some indices do not contain all element nodes, so many paths still have to be examined during a query; other indices produce redundant data in the preorder or postorder traversal, which increases query cost. Among the proposed join algorithms, some, such as the MPMGJN algorithm [23], outperform standard RDBMS join algorithms but perform a lot of unnecessary computation and I/O when matching basic structural relationships, especially parent-child relationships. Others, such as the Stack-Tree-Desc algorithm [24], represent the state of the art in structural joins but do not use index structures; they scan the input lists sequentially. As a result, I/O is wasted on scanning elements that do not participate in the join, and join speed suffers.

Given this situation, the main work and contributions of this paper are as follows:

① Because the conventional numbering scheme used to represent the structure of an XML document is inconvenient to update, this paper proposes an improved sparse numbering scheme. Compared with the conventional method, the sparse numbering scheme has the following merits: the start and end values need not be recomputed when a new node is inserted, which improves the efficiency of updating the tree structure; the XML document is traversed only once when the scheme is constructed, which further reduces the cost of building the tree; and the scheme provides a durable reference for indexing.

② Because storage approaches for the numbering scheme are scarce, this paper proposes storing the sparse numbering scheme in a relational database. With this storage approach, indices can easily be built on the start column, and storage space can be greatly reduced.

③ Drawing on the B+-tree indexing techniques used in DBMSs and combining them with the sparse numbering scheme, this paper proposes a new index structure, the B+-tree structural index, which is very important for optimizing join operations and element location in XML queries. To improve the index structure further, pointers are introduced, yielding a B+-tree structural index with sibling pointers (B+-sp for short). This structural index avoids the drawback of always traversing the B+-tree from the root.

④ Based on B+-sp, this paper proposes the Anc-Desc-B+-sp join algorithm. Theoretical analysis shows that the time complexity of this join algorithm, O(|A|+log|A|), is clearly lower than that of the Stack-Tree-Desc algorithm, O(|A|+|D|+|outlist|) [24], because |D|≥|A| and |D|+|outlist|>>log|A|. Preliminary experimental results indicate that the join algorithm is more efficient and faster.

⑤ In XML queries, another important factor influencing query time is locating the XML data source. To address this problem, this paper proposes a distributed framework for locating XML data sources, called Cooperative XML Search Engine (CXSE). CXSE shortens collection time through search-based site selection and scoring of Web documents. Accordingly, CXSE can quickly and correctly locate the URL of the required XML document. Moreover, the retrieval system can serve several XML data sources in XML queries.
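The abstract does not give the details of the sparse numbering scheme or of the B+-sp index, so the following Python sketch only illustrates the general idea under stated assumptions. It uses a generic (start, end) region numbering with a fixed gap between consecutive numbers (the gap size `GAP` and the helper names `number_tree`, `is_ancestor`, `insert_leaf` and `anc_desc_join` are hypothetical, not taken from the thesis), and it uses a binary search over a list sorted by start as a stand-in for a B+-tree lookup on the start column.

```python
from bisect import bisect_right
import xml.etree.ElementTree as ET

GAP = 100  # assumed spacing between consecutive numbers; not from the thesis


def number_tree(root, gap=GAP):
    """Assign (tag, start, end) to every element in a single depth-first pass."""
    rows = []
    counter = 0

    def visit(elem):
        nonlocal counter
        counter += gap
        start = counter
        slot = len(rows)
        rows.append(None)          # reserve the slot so rows stay in document order
        for child in elem:
            visit(child)
        counter += gap
        rows[slot] = (elem.tag, start, counter)

    visit(root)
    return rows                    # sorted by start


def is_ancestor(a, d):
    """Region containment test: a encloses d."""
    return a[1] < d[1] and d[2] < a[2]


def insert_leaf(lo, hi, tag):
    """Number a new leaf inside the free gap (lo, hi) without renumbering others."""
    if hi - lo < 3:
        raise ValueError("gap exhausted; a local renumbering would be needed")
    return (tag, lo + (hi - lo) // 3, lo + 2 * (hi - lo) // 3)


def anc_desc_join(ancestors, descendants):
    """Pair each ancestor with its descendants; both lists are sorted by start.

    The two binary searches stand in for index lookups on the start column,
    so descendants outside an ancestor's region are skipped, not scanned.
    """
    d_starts = [d[1] for d in descendants]
    out = []
    for a in ancestors:
        lo = bisect_right(d_starts, a[1])
        hi = bisect_right(d_starts, a[2])
        out.extend((a, d) for d in descendants[lo:hi])
    return out


rows = number_tree(ET.fromstring("<a><b><c/></b><b/></a>"))
b_rows = [r for r in rows if r[0] == "b"]
c_rows = [r for r in rows if r[0] == "c"]
pairs = anc_desc_join(b_rows, c_rows)
```

Here `rows` could be stored one tuple per table row in a relational database, with an ordinary index on the start column. Calling `insert_leaf(600, 700, "c")` numbers a new child of the second `b` element from the unused gap, leaving every existing row unchanged, which is the update property the abstract attributes to sparse numbering.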

Keywords/Search Tags:

Numbering Scheme, storage, B+-tree structural index, join algorithm

Related items

1	Research On Region Numbering Scheme Based XML Structural Join Algorithm
2	Query Processing And Research For Structural Join Against XML Data
3	A Study Of Coding Index Based On Schema
4	Research On Fuzzy Keywords Search Over Relational Databases And Its Optimization
5	Research On Key Techniques Of Path Expression Query Processing For XML
6	Research On Top-K Join Algorithm Based On The Star Schema
7	Research For XML Query Optimization Technology Based On Relational Database
8	Research On The Storing And Querying Of Semi-Structured Data On The Web
9	Efficient Processing Of Temporal XML Using The Structural Summary Style Method
10	Storage, Index And Management System Research And Implementation Of Mongolian Teaching Resource Based On XML