Research On Database Schema Matching Based On Instances Clustering

Posted on: 2014-10-03
Degree: Master
Type: Thesis
Country: China
Candidate: Y X Zhang
Full Text: PDF
GTID: 2268330425966481
Subject: Computer application technology
Abstract/Summary:
With the rapid development of information technology, the database has become an important tool for data management. Because different industries and sectors describe their data in different ways, sharing such heterogeneous data has become a widely studied topic in the data integration field. The first step of data integration is schema matching, which identifies the semantic correspondences among the elements of the data schemas.

Most schema matching methods proposed so far rely on schema information; however, none of them comes close to 100% accuracy in evaluation. When the schema information is unclear or conflicting, these methods are often of limited use. Based on an analysis of existing methods, this thesis proposes DSMIC, a database schema matching method that uses schema information to guide the clustering of instance information. The method is divided into three modules: a preprocessing module, a clustering module and a mapping generation module. In the preprocessing module, a genetic algorithm processes the schema information to generate a set of candidate matches. In the clustering module, an improved K-Means clustering algorithm clusters the instance data of the schema elements in the candidate match set, and the similarity among the schema elements is then calculated from the clustering results. In the mapping generation module, the similarities are used to build a weighted bipartite graph, and a maximum weight matching algorithm extracts the final matching of the schema elements.

Finally, the experimental results verify the feasibility of the instance-clustering-based schema matching method and show that it improves the precision, recall and comprehensiveness of schema matching to some extent.
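The mapping generation step can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the abstract does not state how similarity is derived from the clustering results, so the cosine similarity over per-cluster instance counts used here is an assumption, as are the column names, the histograms and the helper functions `cluster_similarity` and `max_weight_matching`. The matching is found by exhaustive search, which is only practical for very small schemas; a real system would use a polynomial-time algorithm such as the Hungarian method.

```python
from itertools import permutations
from math import sqrt


def cluster_similarity(hist_a, hist_b):
    """Cosine similarity between two cluster-membership histograms.

    Assumption: each schema element is summarised by how many of its
    instance values fell into each cluster.
    """
    dot = sum(a * b for a, b in zip(hist_a, hist_b))
    norm = sqrt(sum(a * a for a in hist_a)) * sqrt(sum(b * b for b in hist_b))
    return dot / norm if norm else 0.0


def max_weight_matching(source, target, weight):
    """Exhaustive maximum-weight bipartite matching.

    Requires len(source) <= len(target). Every source element is paired
    with a distinct target element so that the summed weight is maximal.
    """
    best_total, best_pairs = -1.0, []
    for perm in permutations(target, len(source)):
        total = sum(weight[(s, t)] for s, t in zip(source, perm))
        if total > best_total:
            best_total, best_pairs = total, list(zip(source, perm))
    return best_pairs, best_total


# Hypothetical per-cluster instance counts (3 clusters) for each column.
source_hists = {"cust_name": [9, 1, 0], "cust_phone": [0, 2, 8]}
target_hists = {"telephone": [1, 1, 8], "full_name": [8, 2, 0], "city": [3, 6, 1]}

# Edge weights of the bipartite graph: similarity of every source/target pair.
weights = {
    (s, t): cluster_similarity(hs, ht)
    for s, hs in source_hists.items()
    for t, ht in target_hists.items()
}

pairs, total = max_weight_matching(list(source_hists), list(target_hists), weights)
for s, t in pairs:
    print(f"{s} -> {t} (similarity {weights[(s, t)]:.3f})")
```

With these made-up histograms, the sketch pairs `cust_name` with `full_name` and `cust_phone` with `telephone`, because columns whose instances concentrate in the same clusters receive the highest edge weights.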
Keywords/Search Tags: Schema Matching, Genetic Algorithms, Instance Clustering, Mapping Generation