Research And Implementation Of Instance-based Heterologous Schema Matching

Posted on: 2018-03-07
Degree: Master
Type: Thesis
Country: China
Candidate: L L Guo
GTID: 2348330512980192
Subject: Computer Science and Technology
Abstract/Summary:
In recent years, almost every enterprise has digitized its business data, and each has built its own storage system to hold that data. Database merging caused by corporate mergers or reorganizations, cross-database query services, data integration, and other applications all rely on techniques for merging heterogeneous data. Schema matching is a basic problem in heterogeneous data integration: its goal is to extract effective features that describe the similarity between schemas, and then to find the best correspondence between the elements of the database schemas. At present, most schema matching is still done manually. Given the rapidly increasing number of data sources to integrate and the heterogeneity of those sources, manual schema matching is becoming a more tedious, time-consuming, error-prone, and therefore expensive process. Automating this process to make it faster and less labor-intensive has thus been one of the main tasks in data integration. Several types of solutions have been proposed after many years of research. Some methods rely on descriptive information about the schema, such as column names and data types; others rely on schema instances or other kinds of auxiliary information. Although a number of groundbreaking methods exist, the majority are not domain independent, so they solve only specific problems and lack generality. In this thesis, a novel database schema matching method is proposed based on an analysis of the main principles of current methods. It uses ordered mutual information and does not rely on any descriptive information about the schema. The method addresses the problem of matching schemas with opaque column names and opaque data values from two aspects: generality and efficiency. Because the method does not rely on any descriptive knowledge and extracts effective features to build a similarity model, it has strong generality. Furthermore, extensive experiments on various datasets indicate that the proposed technique outperforms earlier schema matching methods in terms of efficiency and accuracy.
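
The abstract does not spell out how "ordered mutual information" is computed, so the following is only a minimal sketch of the general idea of name-independent, instance-based column features. It assumes each column is a list of discrete values, and the helper names (entropy, mutual_information, column_signature) are hypothetical and not taken from the thesis. For each column it computes the entropy and the mutual information with every other column in the same table, sorted in descending order, so the resulting signature depends neither on column names nor on the literal data values.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a list of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def mutual_information(xs, ys):
    """Empirical mutual information (in bits) between two aligned columns."""
    if len(xs) != len(ys):
        raise ValueError("columns must have the same number of rows")
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * math.log2(c * n / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )


def column_signature(table):
    """Map each column to (entropy, sorted MI with the other columns).

    `table` is a dict of column name -> list of values. The names are used
    only as dictionary keys; the signature itself is computed purely from
    the co-occurrence statistics of the instance data. This is an assumed
    illustration, not the thesis's actual similarity model.
    """
    signature = {}
    for name, col in table.items():
        mis = sorted(
            (mutual_information(col, other)
             for other_name, other in table.items() if other_name != name),
            reverse=True,
        )
        signature[name] = (entropy(col), mis)
    return signature


if __name__ == "__main__":
    table = {
        "c1": ["a", "a", "b", "b"],
        "c2": [1, 1, 2, 2],   # fully determined by c1
        "c3": [1, 2, 1, 2],   # independent of c1 and c2
    }
    for name, sig in column_signature(table).items():
        print(name, sig)
```

Under these assumptions, columns from two different tables could be compared by the distance between their signatures. How the thesis actually builds and scores its similarity model is not stated in the abstract.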
Keywords/Search Tags: Schema matching, Opaque conditions, Mutual information