
Research on Temporal XML Index Based on Feature Vector

Posted on: 2011-02-23
Degree: Master
Type: Thesis
Country: China
Candidate: T K Zheng
Full Text: PDF
GTID: 2178360305452011
Subject: Computer software and theory
Abstract/Summary:
With the continuous development of network technology and web services and the wide application of e-commerce, XML has become the standard for data representation and data exchange in web applications. It carries the semantics of the data it represents and can serve as an intermediate format across platforms. XML database technology has developed well and produced a large number of outstanding results. Introducing time information into XML data is of practical significance, because time is an important element of commercial applications and can be used to represent the valid time of data. A temporal XML database extends an XML database with support for temporal properties. It can track historical data and restore the data to its state at any earlier time. Researchers both at home and abroad are increasingly concerned with temporal XML databases.

The demand for query processing capability grows as the amount of data increases over time. How to build efficient indexes is therefore an important issue for temporal XML data, as it is for non-temporal data. Among existing temporal XML indexing approaches, most generate a new version whenever a temporal XML document is updated and must traverse different versions to complete query processing. Others draw on non-temporal XML indexing techniques and extend them into temporal indexes by adding temporal support to query processing and index maintenance. Non-temporal XML indexes fall into two categories: node-record-style indexes and structural-summary-style indexes. These index construction ideas offer useful guidance for temporal XML indexing, which is still at an early stage.

In the temporal XML model used in this paper, data with different valid times are stored in the same document, rather than keeping a separate version of the data for each time period.
This paper presents TFIX, a new temporal XML index that makes good use of the temporal characteristics of XML documents. Its basic idea is that, when processing a query, it searches only the parts of the document that may contain query results rather than traversing the entire document. For a large temporal XML document, we first enumerate all sub-document chips of depth K, where K is a parameter of the TFIX index, and characterize each chip with a feature vector. We compute the feature vector of every document chip and insert it into a B+ tree as a key to build the index. When processing a query, we treat the twig query as a query tree, compute its feature vector, and match it against the B+ tree to find the set of document chips that may contain query results. Finally, we only need to traverse the chips in this intermediate result set to obtain the final results. The feature vector proposed in this paper has the following components: the name of the document root node, the maximum and minimum eigenvalues of the matrix corresponding to the document graph, and the valid time of the root node. TFIX is based on basic graph theory and the nature of temporal XML documents. This paper discusses the TFIX index construction, query processing, and index maintenance algorithms, and verifies the performance of the index through experiments.

The innovation of this paper is a feature vector with a pruning effect that reduces the scope of traversal and improves query processing performance. The idea differs from that of structural-summary-style indexes, although we also introduce the concept of bisimulation for certain purposes. Experimental results show that the index delivers good query performance.
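
The pruning idea behind the feature vector can be illustrated with a minimal Python sketch. It is not the thesis's implementation, and several choices are assumptions made only for illustration: the abstract does not say which matrix represents a document chip, so a symmetric 0/1 adjacency matrix is used here; the query tree is assumed to appear as an induced subgraph of any chip that contains it, so that by eigenvalue interlacing its eigenvalue range lies inside the chip's range; the query root is assumed to match the chip root; valid times are assumed to be closed integer intervals compared by overlap; and the B+ tree is replaced by a plain list scan. The names `FeatureVector`, `eigen_range`, `adjacency`, `feature_vector`, and `may_contain` are hypothetical.

```python
import math
from typing import NamedTuple


class FeatureVector(NamedTuple):
    root_name: str    # name of the chip's root node
    lam_min: float    # minimum eigenvalue of the chip's matrix
    lam_max: float    # maximum eigenvalue of the chip's matrix
    valid_from: int   # start of the root's valid time (assumed integer)
    valid_to: int     # end of the root's valid time (assumed integer)


def eigen_range(matrix, sweeps=50, tol=1e-18):
    """Return (min, max) eigenvalue of a small symmetric matrix.

    Uses cyclic Jacobi rotations so the sketch needs no third-party library.
    """
    n = len(matrix)
    a = [row[:] for row in matrix]
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(i + 1, n))
        if off < tol:
            break
        for p in range(n):
            for q in range(p + 1, n):
                if abs(a[p][q]) < 1e-15:
                    continue
                if a[p][p] == a[q][q]:
                    theta = math.pi / 4
                else:
                    theta = 0.5 * math.atan(2 * a[p][q] / (a[q][q] - a[p][p]))
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):   # rotate columns p and q
                    akp, akq = a[k][p], a[k][q]
                    a[k][p] = c * akp - s * akq
                    a[k][q] = s * akp + c * akq
                for k in range(n):   # rotate rows p and q
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k] = c * apk - s * aqk
                    a[q][k] = s * apk + c * aqk
    diag = [a[i][i] for i in range(n)]
    return min(diag), max(diag)


def adjacency(n, edges):
    """Symmetric 0/1 adjacency matrix (an assumed encoding of the chip)."""
    m = [[0.0] * n for _ in range(n)]
    for u, v in edges:
        m[u][v] = m[v][u] = 1.0
    return m


def feature_vector(root_name, n, edges, valid_from, valid_to):
    lo, hi = eigen_range(adjacency(n, edges))
    return FeatureVector(root_name, lo, hi, valid_from, valid_to)


def may_contain(chip, query, eps=1e-9):
    """Necessary (not sufficient) test: False means the chip can be skipped."""
    return (chip.root_name == query.root_name
            and chip.lam_min <= query.lam_min + eps
            and query.lam_max <= chip.lam_max + eps
            and chip.valid_from <= query.valid_to
            and query.valid_from <= chip.valid_to)


# Three made-up chips rooted at "book"; node 0 is always the root.
index = [
    feature_vector("book", 3, [(0, 1), (1, 2)], 2001, 2005),          # chain
    feature_vector("book", 2, [(0, 1)], 2001, 2010),                  # one child
    feature_vector("book", 4, [(0, 1), (0, 2), (0, 3)], 2008, 2010),  # star
]
# Query tree: a three-node chain under "book", asked for the year 2004.
query = feature_vector("book", 3, [(0, 1), (1, 2)], 2004, 2004)
candidates = [i for i, chip in enumerate(index) if may_contain(chip, query)]
print(candidates)
```

In this toy data the second chip is rejected by the eigenvalue range and the third by the valid time, so only the first chip would still need to be traversed. The test is only a filter: a chip that passes must still be checked against the actual query.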
Keywords/Search Tags: temporal XML index, TFIX index, query, index maintenance