Font Size: a A A

Research On The Key Techniques For XML Index And Query

Posted on:2009-07-01Degree:DoctorType:Dissertation
Country:ChinaCandidate:Y J FanFull Text:PDF
GTID:1118360272958850Subject:Computer software and theory
Abstract/Summary:PDF Full Text Request
With the rapid development of Intemet,more and more information processing systems adopt XML documents as carriers for storing,exchanging,and publishing information.XML has become the de facto standard for data representation and exchange on the web.Meanwhile the structure of XML documents and query demand of web users are becoming more complicated.Graph-structured XML documents containing many reference edges and XML IR(i.e.,Information Retrieval) queries with incomplete paths and keywords are getting increasingly commonplace.Thus,XML index and query technologies meet the serious challenge.We classify XML indexes into three categories,node-record-style index for tree-structured XML data,structural-summary-style index(i.e.,structural index) for graph-structured XML data,and XML-FullText-united index.Compared with noderecord-style index,research on the latter two kinds of indexes is relatively weak,and their overall performance is low.A long distance still exists between their current level and actual application.The problems for existing structural indexes are mainly as follows: long create time,large space size,low query performance,and weak research on branching path query.To solve these problems,we propose three efficient structural indexes.In addition,the problem for existing XML-FullText-united indexes is mainly as follows:lack of united index model for XML data and full-text,and its efficient collaborative query mechanism.Therefore,we propose an efficient XML-FullText-united index.Our primary works are as follow:(1) Research on semi-dynamic structural index supporting simple path query.On the basis of Inter-Relevant Successive Trees(IRST),we introduce k-similarity,a new equivalence relation and the idea of similarity merging originated from structural index,and propose IRST(k)-index,an efficient IRST-based semi-dynamic structural index supporting simple path query.Moreover,the related algorithms and theoretical proving are presented.The IRST(k)-index,which uses k-similarity as equivalence relation and is realized by IRST,can evaluate simple path expressions efficiently.Compared with the same kind of indexes,our experiment results show that IRST(k)-index performs more efficiently in terms of space consumption and query performance,while using much less construction time.(2) Research on semi-dynamic structural index supporting branching path query.We introduce k-l-similarity,a new equivalence relation and the notions of backward and forward similarities into IRST(k)-index,and propose IRST(k,l)-index,an efficient IRST-based semi-dynamic structural index supporting branching path query.In addition, the related algorithms and theoretical proving are presented.The IRST(k,l)-index,which uses k-l-similarity as equivalence relation and is implemented by IRST,can evaluate branching path expressions efficiently.Compared with the same kind of indexes,our experiment results show that IRST(k,l)-index performs more efficiently in terms of space consumption and query performance,while using significantly less construction time.(3) Research on full-dynamic structural index supporting branching path query.We introduce the notions of k-l-bisimulation,forward similarity,and backward similarity into M(k)-index,and propose MBF(k,l)-index,an efficient full-dynamic structural index supporting branching path query.Moreover,the related algorithms and theoretical proving are presented.The MBF(k,l)-index,which uses k-l-bisimulation as equivalence relation,can evaluate branching path expressions efficiently.To avoid the over-refinement of irrelevant index and data nodes,the index makes use of the query results of frequent used path expressions(FUP) to refine itself.Compared with the same kind of indexes,our experiment results show that MBF(k,l)-index performs more efficiently in terms of space consumption and query performance.(4) Research on united index for XML data and full-text.On the basis of Inter-Relevant Successive Trees(IRST),we propose a united index model for XML tree structure and full-text,Inter-relevant Interval Successive Trees based on Successive Pattern Trees,and establish a united index mechanism for tree structure and text nodes in XML documents,XML united index.Moreover,the backward construction algorithm and collaborative query algorithm are presented.Compared with the same kind of indexes,our experiment results show that XML united index performs more efficiently in terms of expansion ratio and query performance.
Keywords/Search Tags:XQuery query, XML IR query, simple path query, branching path query, XML structural index, semi-dynamic structural index, full-dynamic structural index, XML united index, IRST
PDF Full Text Request
Related items