Research On Capturing Both Types And Constraints In Data Integration

Posted on: 2007-11-11
Degree: Master
Type: Thesis
Country: China
Candidate: J Ma
Full Text: PDF
GTID: 2178360212995459
Subject: Computer software and theory
Abstract/Summary:
XML is rapidly emerging as a standard for exchanging business data on the World Wide Web. It is becoming the preferred technology for data publishing, data exchange, electronic commerce, and data integration, especially the integration of e-commerce systems. At present, Web-based data exchange applications frequently require enterprises to integrate data extracted from multiple distributed, heterogeneous sources and to export it as an XML document, with the integrated XML data typically conforming to a schema predefined by the dealers. A schema typically consists of two parts: a type specification and a set of integrity constraints; thus, the integrated data should both conform to the type and satisfy the constraints.

This thesis surveys the current state of XML data integration in China and abroad, and studies the problem of capturing both types and constraints in data integration.

First, it introduces current data exchange applications on the Web and describes in detail the approaches to XML integration and XML publishing systems. On this basis, the existing methods for capturing both types and constraints in data integration are analyzed.

Second, to address schema updates on the underlying sources, which arise from the dynamic nature of trading in such data exchange applications, a schema-directed dynamic integration framework for XML is proposed. Inspired by the idea of model management, the framework uses high-level operators to write scripts and propagates schema updates from the sources to the integrated side.

Moreover, the consistency problem for XML document specifications is analyzed. For the case where XML integrity constraints are inconsistent in the presence of a DTD in the data integration setting above, the framework checks satisfaction of the XML constraints by compiling the constraints in parallel with document generation, and then takes corresponding actions. Accordingly, it guarantees consistency between the final integrated document and the predefined XML schema, and meets the requirements of data exchange applications.

Finally, an example illustrates the evaluation process within the framework. The query merging algorithm and the update propagation algorithm, which are used during integration and evaluation, are verified experimentally for correctness and feasibility. The experimental results show that both algorithms improve the performance of the integration framework.
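
The idea of checking constraints while the integrated document is being generated, rather than in a separate pass afterwards, can be illustrated with a minimal Python sketch. It is not the thesis's framework or algorithm. The element names, the `ITEM_CHILDREN` list standing in for a DTD content model, the `integrate` function, the sample dealer data, and the choice to reject violating rows as the "corresponding action" are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical type specification (a stand-in for a DTD content model):
# every <item> must contain exactly these children, in this order.
ITEM_CHILDREN = ("id", "name", "price")


def integrate(sources):
    """Merge rows from several sources into one <catalog> document.

    The key constraint "item id is unique" is checked while the document
    is generated, so a violating row is never written to the output.
    """
    root = ET.Element("catalog")
    seen_ids = set()   # running state of the key-constraint check
    rejected = []      # (source, row, reason) for rows that were skipped

    for source_name, rows in sources.items():
        for row in rows:
            # Type check: the row must supply every required child.
            if any(field not in row for field in ITEM_CHILDREN):
                rejected.append((source_name, row, "type"))
                continue
            # Constraint check: refuse a duplicate key before emitting it.
            if row["id"] in seen_ids:
                rejected.append((source_name, row, "key"))
                continue
            seen_ids.add(row["id"])
            item = ET.SubElement(root, "item")
            for field in ITEM_CHILDREN:
                ET.SubElement(item, field).text = str(row[field])
    return root, rejected


# Assumed sample data: two dealers, one duplicate key, one incomplete row.
sources = {
    "dealer_a": [{"id": "1", "name": "bolt", "price": 0.10}],
    "dealer_b": [
        {"id": "1", "name": "nut", "price": 0.05},
        {"id": "2", "name": "washer"},
    ],
}
root, rejected = integrate(sources)
print(ET.tostring(root, encoding="unicode"))
print([reason for _, _, reason in rejected])
```

In this sketch the output document satisfies both the type and the key constraint by construction, because each row is checked at the moment it would be emitted. A real system would also have to handle richer constraints (for example foreign keys) and decide between rejecting, repairing, or reporting violations.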
Keywords/Search Tags: Data integration, Schema-directed, Data exchange, Schema update, Model management