Research On Storage And Query Of XML Data Based On Labeling Scheme

Posted on: 2007-10-30
Degree: Master
Type: Thesis
Country: China
Candidate: L W Yue
Full Text: PDF
GTID: 2178360212995473
Subject: Computer application technology
Abstract/Summary:
Because XML data has characteristics that differ from traditional data forms, mature techniques from traditional databases cannot process it efficiently. New processing methods designed around those characteristics are therefore needed. Storing and querying XML data efficiently is one of the most important problems in XML data processing and has recently become a hot topic in XML data management research. To address the storage of massive numbers of documents without a fixed schema and to improve the efficiency of holistic path join algorithms, this paper focuses on two techniques, the storage of XML data and join-based querying of XML data, both based on a fixed labeling scheme.

Firstly, because most existing labeling schemes cannot efficiently support dynamic updates and storage retrieval, this paper proposes a new dynamic labeling scheme called Pri-order. The scheme represents the label of each node as a 3-tuple. It supports updates to an XML document efficiently: only a very small number of nodes need to be relabeled when the document is updated. At the same time, the three parts of the label work together to preserve the structure of the document and to allow the original document to be retrieved correctly.

Secondly, because existing storage methods cannot efficiently handle the numerous documents without a fixed schema found on the Internet, this paper proposes a new storage method, SXBP, based on the Pri-order labeling scheme. The method decomposes the document structure into nodes and stores them in a relational schema according to node type; the three parts of each node label are mapped to attributes of the relational tables. To support queries, all simple paths occurring in the document are also stored in a table. The method can handle any document, whether or not it has a fixed schema, and storing the simple paths reduces the storage space required when documents share the same fixed schema. A retrieval algorithm based on SXBP is also proposed.

Thirdly, to reduce the large volume of useless intermediate results generated by previous holistic path join algorithms, this paper modifies the existing TwigStackList algorithm based on the Pri-order labeling scheme. The main technique is to read ahead some elements in the input data streams and cache a limited number of them in lists in main memory. An element of a branching node in a list is pushed onto the stack only when it actually participates in a final solution; otherwise it is deleted from the list. This further reduces the size of the intermediate results, improves the efficiency of the algorithm, and saves query time.

Finally, the paper reports performance tests on real datasets. The tests evaluate the storage method SXBP and the holistic join algorithm TJP studied in this paper and compare them with previous methods in the related fields.
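
The storage idea described above (nodes split into relational tables by node type, label parts mapped to columns, and simple paths kept once in a separate table) can be illustrated with a minimal Python sketch. It is not the thesis's SXBP implementation. The abstract does not define the three components of a Pri-order label, so the triple used here (pre-order rank, parent's rank, sibling position) is only a stand-in, and the table names, column names, and the functions `shred` and `elements_on_path` are hypothetical.

```python
import sqlite3
import xml.etree.ElementTree as ET


def shred(xml_text, conn):
    """Store one XML document in relational tables, split by node type."""
    cur = conn.cursor()
    cur.executescript("""
        CREATE TABLE simple_path(path_id INTEGER PRIMARY KEY, path TEXT UNIQUE);
        -- (ord, parent_ord, sib_pos) is a stand-in label, NOT the Pri-order triple.
        CREATE TABLE element(ord INTEGER PRIMARY KEY, parent_ord INTEGER,
                             sib_pos INTEGER, path_id INTEGER, tag TEXT);
        CREATE TABLE attribute(owner_ord INTEGER, name TEXT, value TEXT);
        CREATE TABLE text_node(owner_ord INTEGER, value TEXT);
    """)
    counter = 0

    def path_id(path):
        # Each distinct root-to-node path is stored only once.
        cur.execute("INSERT OR IGNORE INTO simple_path(path) VALUES (?)", (path,))
        row = cur.execute(
            "SELECT path_id FROM simple_path WHERE path = ?", (path,)
        ).fetchone()
        return row[0]

    def visit(elem, parent_ord, sib_pos, parent_path):
        nonlocal counter
        counter += 1                      # pre-order rank of this element
        ord_ = counter
        path = parent_path + "/" + elem.tag
        cur.execute(
            "INSERT INTO element VALUES (?, ?, ?, ?, ?)",
            (ord_, parent_ord, sib_pos, path_id(path), elem.tag),
        )
        for name, value in elem.attrib.items():
            cur.execute("INSERT INTO attribute VALUES (?, ?, ?)", (ord_, name, value))
        if elem.text and elem.text.strip():
            cur.execute("INSERT INTO text_node VALUES (?, ?)", (ord_, elem.text.strip()))
        for pos, child in enumerate(elem, start=1):
            visit(child, ord_, pos, path)

    visit(ET.fromstring(xml_text), 0, 1, "")
    conn.commit()


def elements_on_path(conn, path):
    """Return the stand-in labels of all elements reached by a simple path."""
    rows = conn.execute(
        "SELECT e.ord, e.parent_ord, e.sib_pos "
        "FROM element e JOIN simple_path p ON p.path_id = e.path_id "
        "WHERE p.path = ? ORDER BY e.ord",
        (path,),
    )
    return rows.fetchall()


if __name__ == "__main__":
    doc = ("<lib><book id='1'><title>A</title></book>"
           "<book id='2'><title>B</title></book></lib>")
    conn = sqlite3.connect(":memory:")
    shred(doc, conn)
    print(elements_on_path(conn, "/lib/book/title"))  # [(3, 2, 1), (5, 4, 1)]
```

In this sketch, a simple-path query becomes one lookup in `simple_path` followed by a join on `path_id`. Because documents that share a structure reuse the same path rows, only the per-node rows grow with the data, which is the kind of space saving the abstract attributes to storing simple paths.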
Keywords/Search Tags:Labeling Scheme, Dynamic Update, Document Storage, Storage Retrieval, Holistic Join, Cache List