Research Of The Generation Technique For Integrated Schemas In Dataspace Based On The Similarity Of Concepts

Posted on: 2011-01-03
Degree: Master
Type: Thesis
Country: China
Candidate: H X Hou
GTID: 2178360305460243
Subject: Computer Science and Technology
Abstract/Summary:
With the development of web and digital technology, data management has taken on new characteristics: the data is diverse, heterogeneous, and growing rapidly. Traditional DBMSs do not fit these characteristics well, so a new data management technology called dataspace has been proposed. Unlike a traditional DBMS, a dataspace is based on entities. It is characterized by a from-data-to-schema approach, dynamic size, pay-as-you-go integration, and uncertainty, where uncertainty arises at three levels: data, schema matching, and query processing. Research on dataspace schema generation plays an important role in query optimization.

Some existing integration methods are based on relational schemas, and others perform concept matching and integration manually; neither reflects the characteristics of dataspace. In this paper we propose a dataspace schema generation method based on the similarity of concepts. To make schema integration easier, we first extract schemas from the different data sources. Schema extraction targets structured relational data and semi-structured XML data: relational schemas are transformed into XML Schemas, and XML Schemas are extracted from XML data. We then compute the similarity of concepts to carry out matching and integration, using WordNet to improve the matching degree between concepts.

Compared with existing methods, schema generation based on the similarity of concepts is automatic: concepts are matched according to their similarity, the matching result is obtained by applying a threshold, and the integrated schema is built on a tree structure.
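The threshold-based matching step described above can be illustrated with a minimal sketch. This is not the thesis's algorithm. It assumes that concepts are plain element names, that string similarity comes from Python's `difflib`, and that a small hand-written `SYNONYMS` table stands in for a WordNet lookup. The function names, the sample concepts, and the 0.8 threshold are all hypothetical, and the integrated result is a flat list rather than the tree structure the abstract mentions.

```python
from difflib import SequenceMatcher

# Hypothetical synonym groups standing in for a WordNet lookup (assumption).
SYNONYMS = [
    {"author", "writer"},
    {"title", "name"},
    {"price", "cost"},
]


def similarity(a: str, b: str) -> float:
    """Return a similarity score in [0, 1] for two concept names."""
    a, b = a.lower(), b.lower()
    if a == b:
        return 1.0
    # Treat names in the same synonym group as fully similar.
    if any(a in group and b in group for group in SYNONYMS):
        return 1.0
    # Fall back to plain string similarity.
    return SequenceMatcher(None, a, b).ratio()


def match_concepts(left, right, threshold=0.8):
    """Greedy one-to-one matching: take the best-scoring pairs first and
    keep only pairs whose score reaches the threshold."""
    scored = sorted(
        ((similarity(l, r), l, r) for l in left for r in right),
        key=lambda t: t[0],
        reverse=True,
    )
    used_left, used_right, matches = set(), set(), []
    for score, l, r in scored:
        if score < threshold:
            break  # all remaining pairs score lower
        if l in used_left or r in used_right:
            continue
        matches.append((l, r))
        used_left.add(l)
        used_right.add(r)
    return matches


def integrate(left, right, threshold=0.8):
    """Merge two concept lists: matched concepts appear once (under the
    left-hand name), unmatched right-hand concepts are appended."""
    matched_right = {r for _, r in match_concepts(left, right, threshold)}
    return list(left) + [r for r in right if r not in matched_right]


if __name__ == "__main__":
    schema_a = ["title", "author", "price"]
    schema_b = ["name", "writer", "publisher", "Price"]
    print(match_concepts(schema_a, schema_b))
    print(integrate(schema_a, schema_b))
```

Raising the threshold makes matching stricter and leaves more concepts unmerged. Lowering it merges more aggressively and raises the risk of false matches.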
Keywords/Search Tags: dataspace, schema extraction, concept, similarity of concepts, schema integration