Design And Implementation Of A Linked Data Driven Document Annotation System

Posted on: 2014-01-02
Degree: Master
Type: Thesis
Country: China
Candidate: R H Cao
GTID: 2268330422951993
Subject: Software engineering
Abstract/Summary:
Annotation acquisition has recently become an essential step in training supervised classifiers. Manual annotation, however, is often time-consuming and expensive. Recruiting annotators through Internet services is therefore an appealing option: it allows many labeling tasks to be outsourced in bulk, typically at low overall cost and with higher efficiency. To this end, several Semantic Web technologies are adopted, such as Linked Data and the SPARQL query language.

In this work, an annotation service is proposed to annotate and classify document resources, mainly web pages, using concepts from a domain reference ontology. To enable this, both the reference ontology and the resources to be annotated are enriched with concepts from DBpedia, the main and largest dataset in the Linking Open Data cloud. A matching algorithm working on the extracted DBpedia entries then allows concepts from the reference ontology to be associated with web resources (semantic annotation). The algorithm takes the results of document and ontology enrichment (obtained from the Linked Open Data cloud, especially its DBpedia part), compares them, builds connections between them and computes relevance scores; after filtering the scores, it returns a list of relevant concepts.

This work was developed in the framework of BIVEE (Business Innovation and Virtual Enterprise Environment), a European project that supports SMEs (Small and Medium Enterprises) in innovation initiatives. In particular, the proposed annotation service is part of the BIVEE Innovation Observatory, a component in charge of crawling the web and searching for resources that help monitor what happens outside the borders of an enterprise with respect to new innovative initiatives, innovation opportunities and existing initiatives that can be related to the enterprise's specific activities.
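The matching step described in the abstract (compare enriched entries, score, filter, list) can be illustrated with a minimal Python sketch. The abstract does not state how relevance is computed, so the Jaccard overlap used here is an assumption chosen only for illustration, as are the function names, the threshold value and the sample DBpedia entries and concept names.

```python
from typing import Dict, List, Set, Tuple


def relevance(doc_entries: Set[str], concept_entries: Set[str]) -> float:
    """Jaccard overlap between two sets of DBpedia entries.

    Assumption: the thesis abstract does not name its scoring function;
    Jaccard is used here purely as a stand-in.
    """
    if not doc_entries or not concept_entries:
        return 0.0
    shared = doc_entries & concept_entries
    return len(shared) / len(doc_entries | concept_entries)


def annotate(
    doc_entries: Set[str],
    ontology: Dict[str, Set[str]],
    threshold: float = 0.2,  # hypothetical cut-off for the filtering step
) -> List[Tuple[str, float]]:
    """Return ontology concepts whose score passes the threshold, best first."""
    scored = ((concept, relevance(doc_entries, entries))
              for concept, entries in ontology.items())
    kept = [(concept, score) for concept, score in scored if score >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


# Hypothetical enrichment results: DBpedia entries linked to one web page
# and to three concepts of a made-up reference ontology.
example_doc = {"dbr:Innovation", "dbr:Startup_company", "dbr:Venture_capital"}
example_ontology = {
    "InnovationInitiative": {"dbr:Innovation", "dbr:Startup_company"},
    "Funding": {"dbr:Venture_capital", "dbr:Loan"},
    "Logistics": {"dbr:Supply_chain"},
}

if __name__ == "__main__":
    for concept, score in annotate(example_doc, example_ontology):
        print(f"{concept}: {score:.2f}")
```

In this toy input, the page shares two entries with `InnovationInitiative` and one with `Funding`, while `Logistics` shares none and is dropped by the filter. In a real service the two sets would come from the enrichment step against DBpedia rather than from hard-coded values.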
Keywords/Search Tags: Annotation, RDF/OWL, Ontology, Linked Open Data Cloud, Semantic Web, SPARQL