
The Research On Basic Problems Of Active XML Data Management

Posted on: 2010-10-27
Degree: Doctor
Type: Dissertation
Country: China
Candidate: H T Ma
Full Text: PDF
GTID: 1118360278996180
Subject: Computer software and theory
Abstract/Summary:
Active XML (AXML for short) was introduced to address the problems of heterogeneity, interoperability, and autonomy that arise in data management at the scale of the Web, and it has become a powerful new tool for distributed data management. An AXML document is an XML document in which some of the data is given explicitly while other parts are defined only intensionally, by means of embedded calls to Web services. When one of these calls is invoked, its results are returned to enrich the original document.

AXML data management comprises the following problems: (1) Data exchange is the main application of AXML, and the sender must decide whether a given AXML document can be rewritten into a new one conforming to the target schema by invoking the embedded service calls; this is document rewriting. (2) Applications may also need to know whether all the documents conforming to the original schema can be rewritten to conform to the target schema; this is schema rewriting. (3) In AXML data exchange, applications often request data through queries, and satisfiability is the first condition to check before executing a given query. Once satisfiability has been decided, unsatisfiable queries can be rejected, which improves query efficiency. (4) Validation of AXML documents is central to AXML data management and a precondition for both AXML data exchange and querying.

Based on tree automata theory, this thesis studies document rewriting and schema rewriting, satisfiability of queries over AXML documents, and validation of documents. Its goal is to propose efficient algorithms for these problems and to make AXML suitable for data management on the Web.

First, the problems of AXML document rewriting and schema rewriting are studied. AXML document rewriting is to decide whether a given document can be transformed into a new one conforming to the target schema by invoking some of the service calls embedded in it.
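As a rough illustration of the document rewriting question just defined, the following Python sketch enumerates which embedded calls to invoke and checks the result against a toy target schema. It is a brute-force sketch under stated assumptions, not the automata-based polynomial-time algorithm of the thesis: the service names, their fixed outputs (`SERVICES`), the regex-based schema encoding (`TARGET`), and all function names are hypothetical. Real Web services return varying results described by their signatures, which this sketch does not model.

```python
import re
from dataclasses import dataclass, field
from itertools import combinations


@dataclass
class Node:
    """An explicit XML element."""
    label: str
    children: list = field(default_factory=list)


@dataclass
class Call:
    """An embedded service call (intensional data)."""
    service: str


# Hypothetical services with fixed outputs (an assumption of this sketch).
SERVICES = {
    "getTemp": lambda: [Node("temp")],
    "getEvents": lambda: [Node("event"), Node("event")],
}

# Toy target schema: element label -> regex over child labels,
# where every child label is followed by one space.
TARGET = {"city": r"name temp (event |call:getEvents )*"}


def label_of(x):
    return x.label if isinstance(x, Node) else "call:" + x.service


def calls_in(tree):
    """Names of all services called somewhere in the tree."""
    if isinstance(tree, Call):
        return {tree.service}
    return set().union(*(calls_in(c) for c in tree.children))


def materialize(tree, invoke):
    """Copy the tree, replacing calls to services in `invoke` by their results."""
    if isinstance(tree, Call):
        return SERVICES[tree.service]() if tree.service in invoke else [tree]
    new_children = []
    for child in tree.children:
        new_children.extend(materialize(child, invoke))
    return [Node(tree.label, new_children)]


def conforms(tree, schema):
    """Check every element's child sequence against the schema."""
    if isinstance(tree, Call):
        return True  # a call is a leaf; its parent's regex decides if it may stay
    word = "".join(label_of(c) + " " for c in tree.children)
    pattern = schema.get(tree.label, "")  # unlisted labels must be leaves
    return re.fullmatch(pattern, word) is not None and all(
        conforms(c, schema) for c in tree.children
    )


def possible_rewriting(doc, schema):
    """Return a set of services whose invocation makes doc conform, or None."""
    names = sorted(calls_in(doc))
    for r in range(len(names) + 1):
        for chosen in combinations(names, r):
            (result,) = materialize(doc, set(chosen))
            if conforms(result, schema):
                return set(chosen)
    return None


doc = Node("city", [Node("name"), Call("getTemp"), Call("getEvents")])
print(possible_rewriting(doc, TARGET))  # {'getTemp'}
```

Here the toy schema requires a `temp` element but tolerates an uninvoked `getEvents` call, so invoking `getTemp` alone is enough. The enumeration is exponential in the number of distinct services, which is the cost an automata-based approach is meant to avoid.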
AXML document rewriting comes in two types: possible rewriting and safe rewriting. The former is to decide whether the given document can be rewritten into another one conforming to the target schema; the latter is to decide whether all the documents that can be produced from the given AXML document can be rewritten into documents conforming to the target schema. Schema rewriting is to decide whether all the documents conforming to a given AXML schema can be transformed into documents conforming to the target schema. First, the AXML Document Tree Automata (ADTA), used to represent AXML documents, is defined together with its construction algorithm. Based on ADTA and its complement, polynomial-time algorithms for deciding possible rewriting and safe rewriting of AXML documents are presented, and their correctness is analyzed. Second, the AXML Schema Tree Automata for rewriting (ASTAr), used to represent AXML schemas, is defined together with its construction algorithm. Finally, based on ASTAr, a polynomial-time algorithm for deciding AXML schema rewriting is proposed by analyzing the relationship between AXML schema containment and schema rewriting; its correctness and efficiency are also analyzed.

Second, the problem of satisfiability of queries over AXML documents conforming to a given AXML schema is studied. To evaluate a query over an AXML document efficiently, one should first check whether there exists an (A)XML document, obtained from the original one by invoking some Web services, on which the query has a non-empty answer. First, satisfiability of querying AXML documents is formally defined.
Then, a new tree automaton, named AXML Schema Tree Automata for Queries (ASTAq), is defined, which can efficiently represent the set of AXML documents conforming to the given schema; a Tree Pattern Query Automaton (TPQA) is also defined, which represents the set of documents satisfying the query paths of a given tree pattern query. Finally, based on ASTAq and TPQA, a polynomial-time algorithm for checking the satisfiability of tree pattern queries over AXML documents is proposed, and experiments were conducted to verify the utility of satisfiability checking.

Third, the problem of validating AXML documents is studied. Validation of an AXML document is to check whether a given AXML document, together with its service call specifications, conforms to the target schema. A new tree automaton, named AXML Schema Tree Automaton for validation (ASTAv), is defined, which can efficiently represent the set of AXML documents conforming to the given schema and check the validity of the current state of an AXML document. Based on ASTAv, a polynomial-time algorithm for checking AXML validation is proposed by analyzing the relationship between the service call specifications and the target schema. Finally, the experimental results show that the algorithm yields an efficient validation method for AXML documents.
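The abstract does not give the definition of ASTAv, so the following Python sketch only illustrates the general textbook idea behind automaton-based validation: a deterministic bottom-up tree automaton assigns a state to each node from the states of its children and accepts if the root reaches a final state. The rule table `RULES`, the state names, the `call:getPrice` label, and the tuple encoding of trees are all assumptions made for illustration.

```python
import re

# A tree is (label, [children]); a service call is a leaf labelled "call:<service>".
# Each rule maps a label to (regex over child states, resulting state),
# where every child state in the word is followed by one space.
RULES = {
    "call:getPrice": [(r"", "q_call")],
    "price": [(r"", "q_price")],
    "item": [
        (r"q_price ", "q_item"),         # price already materialized
        (r"q_call ", "q_item_pending"),  # price still given by a call
    ],
    "order": [(r"(q_item |q_item_pending )+", "q_order")],
}
FINAL = {"q_order"}


def run(tree):
    """Return the state reached at the root of `tree`, or None if no rule applies."""
    label, children = tree
    states = [run(child) for child in children]
    if None in states:
        return None
    word = "".join(state + " " for state in states)
    for pattern, state in RULES.get(label, []):
        if re.fullmatch(pattern, word):
            return state
    return None


def accepts(tree):
    """A document is valid when the automaton ends in a final state."""
    return run(tree) in FINAL


order = ("order", [
    ("item", [("price", [])]),
    ("item", [("call:getPrice", [])]),
])
print(accepts(order))
```

Each node is visited exactly once, and the same label (`item`) can reach different states depending on whether its content is explicit or still intensional, which is how an automaton can validate the current state of a document that contains uninvoked calls.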
Keywords/Search Tags: Distributed data management, XML, Active XML, Web services, tree automata