Study On XML Engine

Posted on:2005-07-19

Degree:Doctor

Type:Dissertation

Country:China

Candidate:G L Xiang

Full Text:PDF

GTID:1118360122991367

Subject:Library science

Abstract/Summary:

Since the W3C published XML in 1998, it has been adopted across many fields. Many domains use XML as the description language for their documents and information, for example MathML, CML and VoiceML. As a result, large numbers of XML documents are being produced, and managing them has become a critical problem. This thesis addresses that problem. Its main work includes:

(1) XML Engine design. The thesis designs an XML Engine and sets out the relationships among the XML Engine, the XML database and the XML application system. The XML Engine consists of three parts: a storage subsystem, an index subsystem and a query subsystem. The storage subsystem serves as the store for the index subsystem and the query subsystem, and also supplies interfaces to the XML application system. The index subsystem indexes the XML documents held in the storage subsystem and comprises a content index and a structure index. The query subsystem performs queries, complies with XPath 1.0 and can also run full-text queries over XML documents.

(2) XML index technology. The thesis describes the content index and the structure index for XML documents and gives a method for combining the two coherently. It solves three problems in content indexing: storing variable-length records, indexing Chinese words and phrases, and speeding up index construction. It uses four index files to implement the content index and the structure index: a Chinese character index, an English string index, an element index and an attribute index. The thesis first gives the pre-post node labeling method, then proposes the tree-adjacent table, transforms the DOM tree into a tree-adjacent table, and finally creates the element index and the attribute index from that table.

(3) XML query technology. The thesis presents content query and structure query for XML documents and explains how the two are integrated. It briefly discusses three kinds of content query: simple query (also called matching), field query and Boolean query. It gives five basic path query expressions: the simple regular path expression, the order regular path expression, the attribute regular path expression, the value regular path expression and the Kleene closure regular path expression. It summarizes four operations used by these five basic regular path expressions: the PC operation (Parent-Child), the AD operation (Ancestor-Descendant), the CO operation (Containment) and the OR operation (Order).

The thesis uses the following research methods: document investigation, logical deduction, generalization and demonstration. It applies different methods to different research objects so that the research procedure and the research results are sound and credible. There are 44 figures and 19 tables in the thesis.
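
The abstract does not spell out the pre-post node labeling method or how the PC and AD operations are evaluated. The following Python is a minimal sketch of the commonly used pre-order/post-order numbering scheme, on the assumption that the thesis uses something similar. The function names `label_tree`, `is_ancestor` and `is_parent` and the sample document are hypothetical and do not come from the thesis.

```python
import xml.etree.ElementTree as ET


def label_tree(root):
    """Assign a (pre, post, level) label to every element in one depth-first walk.

    pre   = rank of the element in document (pre-order) traversal
    post  = rank of the element in post-order traversal
    level = depth of the element, with the root at level 0
    """
    labels = {}
    counters = {"pre": 0, "post": 0}

    def visit(node, level):
        pre = counters["pre"]
        counters["pre"] += 1
        for child in node:
            visit(child, level + 1)
        # The post-order rank is only known once all children are done.
        labels[node] = (pre, counters["post"], level)
        counters["post"] += 1

    visit(root, 0)
    return labels


def is_ancestor(labels, a, d):
    """AD test: a is an ancestor of d iff a starts before d and ends after d."""
    pre_a, post_a, _ = labels[a]
    pre_d, post_d, _ = labels[d]
    return pre_a < pre_d and post_d < post_a


def is_parent(labels, p, c):
    """PC test: an ancestor exactly one level above the child."""
    return is_ancestor(labels, p, c) and labels[c][2] == labels[p][2] + 1


# Hypothetical sample document, used only for illustration.
doc = ET.fromstring(
    "<book><title>XML</title><chapter><section/></chapter></book>"
)
labels = label_tree(doc)
title, chapter = doc[0], doc[1]
section = chapter[0]

print(is_ancestor(labels, doc, section))    # True
print(is_parent(labels, doc, section))      # False
print(is_parent(labels, chapter, section))  # True
```

With labels of this kind, a structural relationship between two elements can be decided by comparing integers alone, without walking the DOM tree again, which is why such labels are typically stored in an element index.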

Keywords/Search Tags:

XML, index technology, query technology, structure index, structure query, engine

Related items

1	Research On Skyline Query Algorithm Based On New Data Index Structure
2	The Research On XML Query Technology
3	Research On The Key Techniques For XML Index And Query
4	Research On The Index Technology Of Semi-structured Data
5	A Index Technology And Query Method For XML Document Based On Textnode
6	Thor: A universal XML index for efficient XPath query processing
7	Research On Location Privacy Preserving Query Techniques In Road Network
8	Research Of Document Retrieval System Based On Fuzzy Query Technology
9	Research On F&B Index Structure Supporting XML Query
10	Subgraph Query Process On Graph-Structure Data