On theory and applications of reuse of multiple extensible markup languages (XMLs)

Posted on: 2006-07-09

Degree: Ph.D.

Type: Thesis

University: University of Southern California

Candidate: Chen, Yih-Feng

Full Text: PDF

GTID: 2458390008971355

Subject: Computer Science

Abstract/Summary:

The eXtensible Markup Language (XML) has been widely used in domains such as multimedia applications and databases because of its flexibility and self-describing capability. The number of XML-based markup languages has grown rapidly in recent years. Redundancies and conflicts exist among the large number of XML applications that have been designed for similar or identical purposes. A solution to this problem is to make existing XML schemas reusable by decomposing them into meaningful, properly scaled subschemas according to their syntactic and semantic information. New XML schemas can then be constructed from the subschemas in the repository. This thesis investigates in detail how to extract XML subschemas for reuse and how to integrate them.

The integration of multiple XML subschemas, including the operations on their schemas and instances, is called XML harmonization in this work. Two harmonization methodologies, axiom-based and object-oriented, provide two approaches to reusing existing XML schemas. The axiom-based methodology is applied to XML instances that have regular partial structures; users interact with XML files stored in the XML repository through the provided primitives. The object-oriented methodology is applied to non-data-centric application domains, and the multimedia domain serves as an illustrative example.

A systematic approach to the construction and organization of a repository of reusable XML subschemas is also proposed in this thesis. It consists of two main processes: schema processing and repository construction. Every element is a candidate root of a reusable subschema. Two weighting schemes quantify the information of an element based on its structure and its descendants. The weighted elements are then partitioned using the K-means clustering algorithm to provide different resolutions of the repository. Subschemas rooted at elements of greater weight, namely those located in the L highest groups, are chosen as reusable ones. Each subschema is represented as an (N + 1)-tuple for compact and efficient storage. These tuples are also used to remove redundancy from the repository: when the similarity measure between two subschemas exceeds a threshold, the one carrying less information is eliminated.
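The selection step described above (weight every element, cluster the weights with K-means, keep the roots in the L highest groups) can be illustrated with a minimal Python sketch. The abstract does not define the two weighting schemes, so the `weight` function here (descendant count plus direct-child count) is a hypothetical stand-in, and the toy `SCHEMA` document, the helper names, and the values of `k` and `top_l` are all assumptions made for illustration only.

```python
import xml.etree.ElementTree as ET

# Toy element tree standing in for a schema (made up for this example).
SCHEMA = """
<library>
  <book>
    <title/>
    <author><first/><last/></author>
    <publisher><name/><city/></publisher>
  </book>
  <member><name/><id/></member>
</library>
"""

def weight(elem):
    """Hypothetical weight: number of descendants plus number of direct children."""
    descendants = sum(1 for _ in elem.iter()) - 1  # iter() also yields elem itself
    children = len(elem)
    return descendants + children

def kmeans_1d(values, k, iterations=50):
    """Plain Lloyd's k-means on scalars; returns (centroids, labels)."""
    distinct = sorted(set(values))
    # Deterministic start: spread the initial centroids over the distinct values.
    centroids = [distinct[(i * (len(distinct) - 1)) // max(k - 1, 1)] for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iterations):
        # Assignment step: each value joins its nearest centroid.
        labels = [min(range(k), key=lambda c: abs(v - centroids[c])) for v in values]
        # Update step: each centroid moves to the mean of its members.
        updated = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            updated.append(sum(members) / len(members) if members else centroids[c])
        if updated == centroids:
            break
        centroids = updated
    return centroids, labels

def select_roots(root, k, top_l):
    """Return (tag, weight) for elements falling in the top_l heaviest clusters."""
    elems = list(root.iter())
    weights = [float(weight(e)) for e in elems]
    centroids, labels = kmeans_1d(weights, k)
    ranked = sorted(range(k), key=lambda c: centroids[c], reverse=True)
    top = set(ranked[:top_l])
    return [(e.tag, w) for e, w, lab in zip(elems, weights, labels) if lab in top]

root = ET.fromstring(SCHEMA)
selected = select_roots(root, k=3, top_l=2)
print(selected)
```

With `k=3` and `top_l=2`, leaf elements fall into the lowest group and are dropped, while the heavier elements remain as candidate subschema roots. Changing `k` changes how finely the weights are grouped, which is one way to read the "different resolutions" mentioned in the abstract; the thesis itself may define this differently.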

Keywords/Search Tags:

XML, Applications, Markup, Repository, Reuse

Related items

1	Design Of Reuse-oriented Education Resource Extended Service Model And Construction Of Ontology Repository
2	An Approach To Developing Component Repository In Reforming LMIS
3	The Research Of The Key Methods Of The RTCOM-based Component Repository Management
4	Researches On Component-based Software Reuse Technology
5	The Economics Model Of Domain-oriented Software Reuse
6	Design And Implementation Of Software Component Repository Based On XML Description
7	Based The Ras Enterprise-level Asset Management System Design And Applications
8	Research And Realization Of Reusable Asset Repository In Software Enterprise
9	Research On The Construction Of Present Situation And Countermeasures Of The Institution Repository In Chinese Universities
10	Compilation-Based Memory Optimizations For Scientific Computing Applications On Stream Processors