
Design And Realization Of Ontology Instance-Based Data Matching Based On Semantic Web

Posted on: 2009-04-11
Degree: Master
Type: Thesis
Country: China
Candidate: S Wang
Full Text: PDF
GTID: 2178360272476499
Subject: Software engineering
Abstract/Summary:
The World Wide Web (WWW) was developed by Tim Berners-Lee in 1989. Over the following years, the WWW evolved from Web 1.0 into Web 2.0. In Web 1.0, people could only browse websites; in Web 2.0, they can also publish resources (HTML, photos, video, music, etc.) to the network. The WWW is now a huge global information repository, but the quantity of information keeps growing, which makes it difficult for users to find the resources they want. For this reason, Tim Berners-Lee introduced the concept of the semantic web in 1998. In the semantic web, resources carry machine-readable information about their meaning (semantics), so that applications can search and process them automatically. At a conference at Southampton University he also pointed out that the semantic web needs a uniform format in which each database can represent its data, so that the data can be merged and made public. People will not see the advantages of the semantic web if the databases on the network remain separated from each other. The final goal of the semantic web is to turn all the knowledge people have into one huge network that computers can process automatically.
Generally, different databases on the network have different schemas and identifiers. In order to merge and integrate these databases, one must know the meaning of the data they contain. Interoperation between databases (structured, semi-structured or unstructured) is therefore the key point of the semantic web. The goal of interoperation is to make data usable by applications that do not own it. Simply using ontologies, however, does not reduce heterogeneity: it raises heterogeneity problems to a higher level, and these problems are the main obstacle to further progress of the semantic web.
Matching is a promising solution to the semantic heterogeneity problem. It comprises schema-based matching and instance-based matching. Many matching solutions have been proposed so far, but most of them are schema-based, such as the lexicon-based matching approach, the SAT-based matching approach, the semantic matching approach, the ONION system, the LOM system and the PROMPT system. Instance-based solutions have received little attention.
This thesis introduces the Okkam system, proposed by Prof. Paolo Bouquet at the University of Trento, Italy, to solve the instance-based matching and merging problem.
The main principle of Okkam is that entities which represent the same instance resource in different databases are identified by one globally uniform identifier. When integrating different databases, we can therefore confirm that entities match by comparing their URIs. The overall goal of the OKKAM initiative at the University of Trento is to enable the Web of Entities, a global digital space for publishing and managing information about entities, where every entity is uniquely identified, and links between entities can be explicitly specified and exploited in a variety of scenarios.
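The identifier-based matching described above can be illustrated with a minimal sketch. It assumes that each record already carries a global identifier; the field name okkam_id, the example.org URIs and the sample records are hypothetical and are not taken from the thesis or from the real Okkam service.

```python
from collections import defaultdict

# Hypothetical records from two independent databases. The "okkam_id" field
# and the example.org URIs are illustrative assumptions only.
db_a = [
    {"okkam_id": "http://example.org/entity/1", "name": "Alice Example"},
    {"okkam_id": "http://example.org/entity/2", "name": "Example University"},
]
db_b = [
    {"okkam_id": "http://example.org/entity/1", "affiliation": "Example University"},
    {"okkam_id": "http://example.org/entity/3", "name": "Bob Example"},
]


def merge_by_identifier(*databases):
    """Merge records from several databases that share the same identifier.

    Two records are treated as the same entity if and only if their
    identifiers are equal strings; no schema or value comparison is done.
    """
    merged = defaultdict(dict)
    for database in databases:
        for record in database:
            uri = record["okkam_id"]
            for key, value in record.items():
                if key != "okkam_id":
                    # Keep the first value seen for each attribute; a real
                    # system would need an explicit conflict policy here.
                    merged[uri].setdefault(key, value)
    return dict(merged)


merged = merge_by_identifier(db_a, db_b)
```

In this sketch the two records for entity/1 collapse into one description even though the source databases use different attributes, which is the effect the shared identifier is meant to achieve.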
Compared to the WWW, the Web of Entities differs in two main ways: the domain of entities is extended beyond the realm of digital resources to include objects in other realms, such as products, organizations, associations, countries, events, publications, hotels or people; and links between entities are extended beyond hyperlinks to include virtually any type of relation.
This thesis designs and implements the internals of the Okkam system, including the data structure and how entities are searched and published. It then develops two applications based on the Okkam system, Okkam4N and FOAF-O-Matic, and reports the results of using these two applications, which indicate that they can effectively resolve the instance-based matching and merging problem in the semantic web.
Okkam4N is a plug-in for the NeOn Toolkit. It assigns a globally unique identifier (an "Okkam ID") to a newly created individual, rather than relying on manual input by the user or on the standard automatic mechanism of the NeOn Toolkit.
FOAF-O-Matic is a web-based application. Its focal point is to allow users to integrate Okkam identifiers into their FOAF documents in a user-friendly way. This makes it possible to merge a larger number of FOAF graphs describing a person's social networks more precisely, which enhances the integration of information and brings the goal of the FOAF initiative within easier reach.
The development of and research on the Okkam system and the two applications are still at an early stage, and many problems remain to be solved. This is a promising field, and the realization of the Okkam system is expected to accelerate the success of the semantic web.
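The search-then-publish workflow that both applications rely on can be sketched as follows. This is a toy in-memory stand-in, not the Okkam implementation: the class EntityRegistry, its methods, the exact-match search rule and the example.org base URI are all assumptions made for illustration.

```python
import uuid


class EntityRegistry:
    """Toy in-memory stand-in for a global entity repository (hypothetical)."""

    def __init__(self, base_uri="http://example.org/entity/"):
        self.base_uri = base_uri
        self._entities = {}  # identifier -> description (dict of attributes)

    def search(self, description):
        """Return identifiers whose stored attributes agree with every
        attribute in `description` (exact match, an assumed simplification)."""
        return [
            uri
            for uri, stored in self._entities.items()
            if all(stored.get(key) == value for key, value in description.items())
        ]

    def publish(self, description):
        """Store a new entity and return its freshly minted identifier."""
        uri = self.base_uri + uuid.uuid4().hex
        self._entities[uri] = dict(description)
        return uri

    def find_or_create(self, description):
        """Reuse an existing identifier when one matches, else publish."""
        matches = self.search(description)
        if matches:
            return matches[0]
        return self.publish(description)


registry = EntityRegistry()

# An ontology editor creates a new individual and asks for an identifier.
id_from_editor = registry.find_or_create(
    {"name": "Alice Example", "mbox": "mailto:alice@example.org"}
)
# A FOAF document describing the same person later asks for an identifier.
id_from_foaf = registry.find_or_create({"mbox": "mailto:alice@example.org"})
# A different person receives a different identifier.
id_other = registry.find_or_create({"name": "Bob Example"})
```

Because the second request finds the entity published by the first, both tools end up using the same identifier for the same person, while an unrelated entity gets a new one. A real repository would need approximate matching and user confirmation rather than the exact attribute comparison assumed here.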
Keywords/Search Tags: Semantic web, Ontology, Instance, Okkam URI