
Several Key Technologies for XML Indexing, Filtering, and Querying

Posted on: 2006-03-23
Degree: Doctor
Type: Dissertation
Country: China
Candidate: X X Lei
Full Text: PDF
GTID: 1118360155960677
Subject: Computer software and theory
Abstract/Summary:
As a tool for online data exchange and information integration, XML (Extensible Markup Language) has become a new online language, with advantages such as self-description and platform independence. More and more structured and semi-structured data is stored and exchanged in the form of XML; accordingly, it is important to solve the problems of indexing, filtering, and querying XML documents.

On the basis of the characteristics and applications of XML data, this thesis addresses several key technical problems of indexing and querying XML: labeling XML trees, an indexing model for XML documents based on different schemas, optimization of grouped XPath queries, online XML filtering, and a prototype system for indexing and filtering XML documents. Major contributions of this thesis include:

1) IRST-based research on indexing and querying XML

A novel index for labeled rooted trees, based on the Leaf Order Interval Numbering Scheme (LOINS) and Inter-Relevant Successive Trees (IRST), is proposed. IsBaRTI-I, a new index for the rooted-tree data model, takes advantage of IRST properties such as indexing and compressibility. IsBaRTI-II, a space-optimized version of IsBaRTI-I, is also introduced. IsBaRTI-I and IsBaRTI-II index the ancestor-descendant relationship between nodes and the LOINS number of each node by the name (label) of the node and the count of its appearances in the rooted tree. In this way, the index structure and the numbering scheme become a unified whole. Theoretical analysis and experimental results show that IsBaRTI-I and IsBaRTI-II need less time and space to build, and obtain the node sequences and paths matching XPath expressions more quickly, than previous XML indexes.

2) Research on dynamic LOINS

Adopting a labeling scheme for XML trees makes it possible to decide the ancestor-descendant relationship between nodes; the cost of the labeling scheme affects the size of the index and the cost of keeping the index in main memory. Unlike previous indexes, which consider only query speed, this thesis takes the particular features of XML into account and compares the proposed labeling scheme, LOINS, with other labeling schemes, such as OLD interval and prefix labeling schemes. Compared with these, LOINS has the advantages of short label length and flexibility. On the other hand, the average cost of dynamically searching the leaf order interval on the basis of IsBaRTI-II...
Keywords/Search Tags: XML, schema, index, query, filter, Inter-Related Successive Trees (IRST), Leaf Order Interval
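
The abstract states that a labeling scheme lets the ancestor-descendant relationship between two nodes be decided from their labels alone. The abstract does not give the actual definition of LOINS, so the following is only a minimal sketch of a generic leaf-order interval labeling, under stated assumptions: each node is labeled with the order numbers of the leftmost and rightmost leaves in its subtree, plus its depth (added here so that a node and its only child, which share the same leaf interval, can still be told apart). The names `Node`, `label`, and `is_ancestor`, and the sample tree, are hypothetical and do not come from the thesis.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    name: str
    children: List["Node"] = field(default_factory=list)
    lo: int = 0     # order number of the leftmost leaf in this subtree
    hi: int = 0     # order number of the rightmost leaf in this subtree
    depth: int = 0  # distance from the root (assumed extra field)


def label(root: Node) -> None:
    """Assign (lo, hi, depth) to every node in one depth-first pass."""
    next_leaf = 0

    def visit(node: Node, depth: int) -> None:
        nonlocal next_leaf
        node.depth = depth
        if not node.children:
            # Leaves are numbered 1, 2, 3, ... in document order.
            next_leaf += 1
            node.lo = node.hi = next_leaf
            return
        for child in node.children:
            visit(child, depth + 1)
        # An inner node spans from its first child's lo to its last child's hi.
        node.lo = node.children[0].lo
        node.hi = node.children[-1].hi

    visit(root, 0)


def is_ancestor(a: Node, d: Node) -> bool:
    """True if a is a proper ancestor of d, decided from the labels only."""
    return a.depth < d.depth and a.lo <= d.lo and d.hi <= a.hi


# Hypothetical sample tree: catalog(book(title, author), item(price))
title, author, price = Node("title"), Node("author"), Node("price")
book = Node("book", [title, author])
item = Node("item", [price])
catalog = Node("catalog", [book, item])
label(catalog)
```

With these labels, `is_ancestor(book, author)` holds because the interval of `book` contains that of `author` and `book` is shallower, while `is_ancestor(book, price)` fails because the intervals are disjoint. No tree traversal is needed at query time, which is the property the abstract attributes to labeling schemes in general.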