An Index Structure And Query Algorithm For XML Documents With Duplicate Labels

Posted on:2009-12-31

Degree:Master

Type:Thesis

Country:China

Candidate:J N Guo

GTID:2178360272463520

Subject:Computer application technology

Abstract/Summary:

As the de facto standard for data representation and data exchange on the Internet, XML (Extensible Markup Language) has been rapidly and widely adopted. How to query XML documents efficiently has become an important topic in XML research, and introducing an index into query processing is an effective approach. In recent years, different index structures have been proposed for different XML applications, such as DataGuide, 1-Index, the F&B index, and XR-Tree. Each of these index structures meets the needs of a particular environment.

An XML query is usually converted into structural join operations between two node lists, based on containment relationships and document position relationships. From the structural characteristics of the XML document, some nodes in these lists can be determined not to participate in the join. These nodes can therefore be filtered out first using a structure index of the XML document, which reduces the number of elements to be processed and improves the performance of the query algorithm. Previous work has shown that filtering can be carried out on various kinds of structure indexes to improve query efficiency.

Because duplicate labels occur frequently in XML document trees, this thesis presents an index structure, RS-Index, that handles such duplicate-label structures effectively. Using the index information in the query algorithm filters out elements that are irrelevant to the query and thereby improves query efficiency.

The main work of this thesis is as follows:
(1) An XML document index structure, RS-Index, is proposed for duplicate labels, together with the corresponding index construction algorithm.
(2) A filtering algorithm on the RS-Index structure is proposed. Building on this filtering algorithm, a new query algorithm is given that finds the element sequences satisfying the query condition.
(3) An experimental system is constructed, and the index structure, the filtering algorithm, and the query algorithm are implemented in it.
(4) The RS-Index structure is compared comprehensively with other similar index structures on common data sets. The experimental data indicate that filtering with this index structure and query algorithm improves query processing performance for XML documents with many duplicate labels.
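The abstract does not give the details of RS-Index, so the following is only a minimal sketch of the general idea it describes: a structural join between two node lists on the containment relationship, preceded by a filtering step that discards nodes that cannot take part in the join. The region encoding (start, end), the names Node, prefilter and structural_join, and the simple span-based filter are all assumptions made for illustration; they are not the thesis's index or algorithms.

```python
from typing import List, NamedTuple, Tuple


class Node(NamedTuple):
    label: str
    start: int  # position of the opening tag in document order (assumed encoding)
    end: int    # position of the closing tag


def prefilter(ancestors: List[Node], descendants: List[Node]) -> Tuple[List[Node], List[Node]]:
    """Hypothetical stand-in for index-based filtering (not RS-Index itself).

    Drops ancestor candidates that are empty elements (nothing can lie inside
    them) and descendant candidates outside the span covered by the ancestors.
    """
    kept_anc = [a for a in ancestors if a.end - a.start > 1]
    if not kept_anc:
        return [], []
    lo = min(a.start for a in kept_anc)
    hi = max(a.end for a in kept_anc)
    kept_desc = [d for d in descendants if lo < d.start and d.end < hi]
    return kept_anc, kept_desc


def structural_join(ancestors: List[Node], descendants: List[Node]) -> List[Tuple[Node, Node]]:
    """Stack-based ancestor/descendant join; both lists must be sorted by start."""
    result = []
    stack: List[Node] = []  # nested chain of ancestors that are still open
    i = 0
    for d in descendants:
        # Push every ancestor candidate that opens before d.
        while i < len(ancestors) and ancestors[i].start < d.start:
            while stack and stack[-1].end < ancestors[i].start:
                stack.pop()
            stack.append(ancestors[i])
            i += 1
        # Discard ancestors that closed before d opens.
        while stack and stack[-1].end < d.start:
            stack.pop()
        # Every node left on the stack contains d, including nested
        # ancestors that share the same (duplicate) label.
        result.extend((a, d) for a in stack)
    return result


# Positions for <a><a><b/></a><b/></a><b/>, where label "a" is duplicated along one path.
ancestors = [Node("a", 1, 8), Node("a", 2, 5)]
descendants = [Node("b", 3, 4), Node("b", 6, 7), Node("b", 9, 10)]

joined = structural_join(ancestors, descendants)
filtered_anc, filtered_desc = prefilter(ancestors, descendants)
joined_after_filter = structural_join(filtered_anc, filtered_desc)
```

In this toy document the filter removes the last b element before the join runs, and the join result is unchanged. This is the effect the abstract attributes to filtering, although a real structure index would prune far more precisely than this span check.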

Keywords/Search Tags:

XML, Duplicate label, RS-Index, Filtering Algorithm, Query Algorithm

Related items

1	A Query Filtering Algorithm Over XML Data Stream
2	Research On Top-k Subgraph Query Algorithm Based On Double Index
3	Efficient Algorithm Research For Reachability Queries Based On Big Graph
4	Research On Exact Subgraph Matching Algorithm Of Label Graph
5	Research On Data Index And Query Result Sort Algorithm In XML Keyword Query
6	Research Of Chinese News Web Page Duplicate Detection
7	Research On The Key Techniques For XML Index And Query
8	Study On Spatial Index Structure And Spatial Query Algorithm In Supporting System Of Three Resistances
9	A Duplicate Document Detect System Based On GPU Parallel Computation
10	Algorithms Research Of Reachability Query Based On Double Label In Large-scale Graphs