Research On Automatic Semantic Annotation For Domain Documents

Posted on: 2013-03-04
Degree: Master
Type: Thesis
Country: China
Candidate: H B Zhang
Full Text: PDF
GTID: 2248330362474190
Subject: Computer system architecture
Abstract/Summary:
The World Wide Web has become one of the main channels through which people obtain information in the information age. It plays an increasingly important role in people's daily life, work and study, scientific research, and commercial and social life. At the same time, the number of web pages has been increasing exponentially. However, the WWW is just a medium for people to store and share information. Lacking semantic interoperability, it cannot enable machines to locate the desired information quickly and accurately within vast amounts of data. To solve this problem, Tim Berners-Lee proposed the Semantic Web, an intelligent network derived from the WWW. The WWW is document-oriented, while the Semantic Web is data-oriented. Adding semantic information that machines can understand makes the WWW a better medium for information exchange. Semantic annotation research is chiefly concerned with adding semantic information to semi-structured or unstructured information on the Web, moving the Web from a machine-readable to a machine-understandable state. Semantic annotation is the cornerstone of the Semantic Web.

Existing semantic annotation systems still have the following problems. Annotation systems generally annotate universal concepts and fail to annotate domain-specific concepts according to the individual characteristics of different areas. Both manual and semi-automatic annotation call for manual intervention, which limits scalability to large-scale applications. In addition, current semantic annotation systems are almost exclusively for English documents, while semantic annotation systems for Chinese documents are scarce.

This thesis describes and analyzes the Semantic Web, ontology, and the state of semantic annotation technology, focusing on how to apply semantic similarity methods to the automatic semantic annotation of domain documents. The main work and features of this thesis are as follows:

① Based on the domain ontology, this thesis introduces grammatical and semantic analysis of named entities and proposes a semantic annotation approach that combines Wikipedia semantic similarity with Edit Distance. The method takes both the grammatical similarity and the semantic relevance between web resources and ontology concepts into consideration to measure the relevance between resources and the ontology. The method is guided by the domain ontology, and the results are reported to be significantly effective.

② Given that few annotation tools are able to annotate Chinese resources, this thesis proposes a semantic annotation approach for such resources that combines Wikipedia semantic similarity with Baidu Distance. Experimental results demonstrate the feasibility of this approach.
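
The idea in ① of combining a string-level measure with a semantic one can be illustrated with a minimal sketch. It is not the thesis's actual formula, which the abstract does not give. The weighted sum, the weight `alpha`, the `threshold`, and the function names are all assumptions made for illustration, and the Wikipedia-based semantic similarity is represented only by a caller-supplied function `semantic_sim`, since the abstract does not say how it is computed.

```python
from typing import Callable, Iterable, Optional, Tuple


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed row by row with dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def string_similarity(a: str, b: str) -> float:
    """Edit distance normalised to [0, 1]; 1.0 means identical strings."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(a, b) / longest


def combined_score(term: str, concept: str,
                   semantic_sim: Callable[[str, str], float],
                   alpha: float = 0.5) -> float:
    """Weighted sum of string and semantic similarity (alpha is an assumption)."""
    lexical = string_similarity(term.lower(), concept.lower())
    return alpha * lexical + (1.0 - alpha) * semantic_sim(term, concept)


def annotate(term: str, concepts: Iterable[str],
             semantic_sim: Callable[[str, str], float],
             alpha: float = 0.5,
             threshold: float = 0.5) -> Optional[Tuple[str, float]]:
    """Return the best-scoring ontology concept, or None if below threshold."""
    best: Optional[Tuple[str, float]] = None
    for concept in concepts:
        score = combined_score(term, concept, semantic_sim, alpha)
        if best is None or score > best[1]:
            best = (concept, score)
    if best is not None and best[1] >= threshold:
        return best
    return None
```

In this sketch a term is linked to the ontology concept with the highest combined score, and left unannotated when no concept reaches the threshold. A real system would replace `semantic_sim` with a Wikipedia-derived measure and tune `alpha` and `threshold` on annotated data.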
Keywords/Search Tags: Semantic Web, Ontology, Semantic Annotation, Wikipedia semantic similarity