Research On Context Mediation Based Semantic Information Integration Method

Posted on: 2010-09-04
Degree: Doctor
Type: Dissertation
Country: China
Candidate: J F Zhou
Full Text: PDF
GTID: 1118360275486855
Subject: Computer application technology
Abstract/Summary:
In the information society there is a growing need to access distributed, heterogeneous data sources seamlessly and obtain correlated information, so that information can be reused and shared and information systems can interoperate. However, the widespread heterogeneity among data sources hinders this. One of the core problems of information integration is therefore to resolve the heterogeneities among distributed data sources.
Ontology-based semantic information integration uses an ontology to construct the global schema and establishes semantic mappings between the ontology and the data source schemas to eliminate semantic heterogeneity. The weakness of this method is that schema mapping resolves only schema-level heterogeneity, which is not the only kind of semantic heterogeneity among data sources. Results returned by an information integration system that resolves heterogeneity only partially not only prevent users from sharing and reusing existing information but also cause confusion and misunderstanding. Worse, they may lead users to make wrong judgments.
To resolve more kinds of heterogeneity, the semantic heterogeneity among distributed data sources must be researched and analyzed in depth. The context of information is semantics that is implicit in the data source schema. A computer cannot capture such semantics, let alone process it. For a computer to resolve this kind of heterogeneity automatically, a method of representing context formally is necessary. A tightly coupled context representation and mediation mechanism is proposed, and context is represented as a quadruple (D, S, CV, F).
On the basis of the context representation, a context mediation mechanism is introduced into ontology-based information integration, and the original system, represented as the triple (G, S, M), is extended to one represented as the quintuple (G, S, C, M, B). In the extended system, context heterogeneity can be detected and resolved automatically.
The core problem in resolving context heterogeneity is translating between different context values that belong to the same context type. Context translation methods are proposed for each of the four cases of context heterogeneity: a star-model-based method for unit and scale heterogeneity, an equivalence-class-based method for representation heterogeneity, and a metadata representation of formats for format heterogeneity. The common goal of these methods is to reduce the number of predefined rules (functions) and to improve the adaptability, scalability, maintainability and efficiency of context translation.
Entity heterogeneity is a data-level semantic heterogeneity that exists widely among distributed data sources. The main problems with existing methods for this heterogeneity are efficiency and precision. To improve the efficiency of entity identification in information integration, a context-mediation-based two-stage feature vector processing method is proposed and a string comparison function based on common substrings is designed.
Two extensions are made to existing semantic information integration to strengthen the system's ability to resolve semantic heterogeneity. The first introduces the context mediation mechanism into the ontology-based semantic information integration system, so that the extended system can detect and resolve context heterogeneity automatically once schema heterogeneity has been resolved. The second adds the entity identification method to the first extended system. After the two extensions, all three kinds of semantic heterogeneity can be resolved in the right order, forming a complete semantic heterogeneity resolution.
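The abstract does not spell out the star model, so the following is only an illustrative sketch of one plausible reading: every context value of a given context type (for example, a length unit) stores a single conversion factor to a central hub value, and any translation goes through the hub. Under that assumption, n context values need n factors instead of n(n-1) pairwise rules. The class name `StarTranslator`, its methods, and the example units are hypothetical and do not come from the dissertation.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class StarTranslator:
    """Hypothetical star-shaped translator for one context type.

    Assumption: each context value is linked only to the hub by a
    multiplicative factor, so no pairwise rules are stored.
    """
    hub: str
    to_hub: Dict[str, float] = field(default_factory=dict)

    def __post_init__(self) -> None:
        # The hub converts to itself with factor 1.
        self.to_hub[self.hub] = 1.0

    def register(self, context_value: str, factor_to_hub: float) -> None:
        # One rule per context value: amount * factor gives the hub amount.
        self.to_hub[context_value] = factor_to_hub

    def translate(self, amount: float, source: str, target: str) -> float:
        # Source -> hub -> target, composed from two stored factors.
        return amount * self.to_hub[source] / self.to_hub[target]


# Illustrative unit context; a scale context ("thousands" vs "units")
# could be handled the same way under this assumption.
length = StarTranslator(hub="metre")
length.register("kilometre", 1000.0)
length.register("centimetre", 0.01)

converted = length.translate(2.5, "kilometre", "centimetre")
```

Adding a new unit in this sketch requires registering one factor only, which is the kind of rule reduction the abstract names as a goal of the translation methods.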
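The abstract mentions a string comparison function based on common substrings but gives no definition. As a rough illustration only, the sketch below scores two strings by the length of their longest common contiguous substring divided by the length of the longer string. This particular measure, the normalisation, and the function names are assumptions and may differ from the function designed in the dissertation.

```python
def longest_common_substring(a: str, b: str) -> int:
    """Length of the longest contiguous substring shared by a and b."""
    best = 0
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0] * (len(b) + 1)
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                # Extend the common run that ended at the previous characters.
                curr[j] = prev[j - 1] + 1
                best = max(best, curr[j])
        prev = curr
    return best


def substring_similarity(a: str, b: str) -> float:
    """Assumed similarity in [0, 1]; 1.0 means identical after normalisation."""
    a, b = a.strip().lower(), b.strip().lower()
    if not a or not b:
        return 0.0
    return longest_common_substring(a, b) / max(len(a), len(b))


score = substring_similarity("Integration", "integrate")
```

In an entity identification step, a score like this could be compared against a threshold to decide whether two attribute values refer to the same entity; the threshold would have to be chosen per data set.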
Keywords/Search Tags: semantic information integration, context, context mediation, context translation, entity identification