I-XISS: An Improved Indexing And Storage Approach For XML Documents

Posted on: 2008-05-24 | Degree: Master | Type: Thesis
Country: China | Candidate: F Cao | Full Text: PDF
GTID: 2178360242969201 | Subject: Computer software and theory
Abstract/Summary:
With the advent of XML as a standard for data representation and exchange on the Internet, storing and querying XML data has become increasingly important. How to store XML data and query it efficiently is an active research topic, and building indexes over XML documents is an important way to make queries fast and effective.

Current indexes for XML documents fall into three major types: path indexing, node indexing, and sequence-based indexing. Path indexing supports simple path expression queries but is of no help for regular path expressions. Node indexing supports regular path expressions, but it becomes costly for long path expressions, especially when a query produces many intermediate results. Sequence-based indexing supports a variety of queries and does not need structural join operations, but a query containing wildcards must first be transformed into simple path expressions, which reduces the efficiency of query execution.

XISS (XML Indexing and Storage System) is a typical node index. It is built on a node numbering scheme, three index structures, and a set of algorithms. It decomposes a query path expression into a series of element or attribute nodes and then joins the intermediate results to produce the query result. However, the basic unit of a sub-expression in XISS is a single element or attribute node, so a long path expression yields many sub-expressions and intermediate results, which hurts query efficiency.

The main contributions of this thesis are:
(1) Combining the ideas of XISS and path indexing, the structure of XISS is improved so that it handles regular path expressions flexibly and overcomes this shortcoming of XISS.
(2) Based on the improved index structure, a new decomposition algorithm is proposed. It changes the basic unit of a sub-expression from an element or attribute node to a simple path expression, which effectively reduces the number of sub-expressions.
(3) A new search algorithm is proposed, based on the improved index structure and the sub-expression decomposition algorithm. Because it reduces the number of sub-expressions and intermediate results, query time depends on the complexity of the path expression rather than on its length.
(4) An experimental system is constructed that implements the improved index structure, the sub-expression decomposition algorithm, and the search algorithm. XISS is compared with the improved structure, and the improved structure proves more efficient than XISS.

The improved index structure still needs considerable work on its functionality. Future work includes refining the I-XISS query algorithm and adding update functions to the I-XISS system, so that the system can not only query XML data but also edit it (insertion, deletion, modification), giving I-XISS the data manipulation functions that a database management system should have. Finally, how to number nodes and process query expressions in the presence of reference relationships should also be studied, so that the system preserves more complete information about XML documents and supports more complex XML queries.
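
The difference between the two decomposition units described in the abstract can be illustrated with a small sketch. This is not the thesis's algorithm, which the abstract does not give in detail. It assumes a simplified path syntax in which `/` is a child step, `//` is a descendant step, and `*` is a wildcard step; the function names and the example path are hypothetical.

```python
import re


def decompose_by_node(path: str) -> list[str]:
    """Node-level unit: each element or attribute name is its own
    sub-expression (wildcard steps carry no name and are skipped)."""
    return [step for step in re.split(r"/+", path) if step and step != "*"]


def decompose_by_simple_path(path: str) -> list[str]:
    """Simple-path unit: each sub-expression is a maximal run of child
    steps, cut at every descendant step (//) and wildcard (*).
    Assumed reading of the abstract, not the thesis's actual algorithm."""
    subexprs: list[str] = []
    for segment in path.split("//"):
        current: list[str] = []
        for step in segment.split("/"):
            if step in ("", "*"):
                # An empty or wildcard step ends the current simple path.
                if current:
                    subexprs.append("/".join(current))
                    current = []
            else:
                current.append(step)
        if current:
            subexprs.append("/".join(current))
    return subexprs


if __name__ == "__main__":
    query = "/book/chapter//section/figure/*/caption"  # hypothetical query
    print(decompose_by_node(query))
    print(decompose_by_simple_path(query))
```

For this example path, the node-level decomposition yields five sub-expressions and the simple-path decomposition yields three. Fewer sub-expressions means fewer intermediate results to join, which is consistent with the abstract's claim that query time follows the complexity of the expression (the number of `//` and `*` breaks) rather than its length.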
Keywords/Search Tags: XML, I-XISS, Index, Sub-expression, Decomposition algorithm