The Task Of Building Fishery Knowledge Base Based On Wikipedia

Posted on: 2015-11-19
Degree: Master
Type: Thesis
Country: China
Candidate: L Wang
Full Text: PDF
GTID: 2298330422475818
Subject: Computer application technology
Abstract/Summary:
As a knowledge source, Wikipedia offers a convenient and quick way to acquire professional knowledge; this paper aims to construct a semantic knowledge base from Wikipedia. Taking the characteristics of the fishery domain into account, it systematically analyzes semantic similarity and named entity disambiguation. To this end, we improve an existing semantic similarity method and put forward our own method for named entity disambiguation. We also extract infobox knowledge and represent it in N3 format.
Many semantic similarity measures, based on different methods, are applied in natural language processing, knowledge acquisition and information retrieval. Recently, some authors have extended existing methodologies to support multiple ontologies in order to improve the correlation values. This paper uses a feature-based method with a heuristic function to support multiple ontologies, and validates it in semantic similarity experiments on a collection of relevant fishery data.
We survey and compare related work on ontology-based similarity assessment according to the following classification: edge-counting approaches, feature-based measures, and measures based on information content. Edge-counting approaches map input terms to ontological concepts by means of their textual labels and then, in a straightforward way, compute the length of the path that connects the corresponding ontological nodes via is-a links. The longer the path, the more semantically distant the terms are. Feature-based methods try to overcome a limitation of path-based measures, namely that taxonomical links in an ontology do not necessarily represent uniform distances. They address this by considering the degree of overlap between sets of ontological features. As a result, they are more general and could potentially be applied in cross-ontology similarity estimation, a setting in which edge-counting methods cannot be directly applied.
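The contrast between edge counting and feature overlap can be illustrated with a minimal Python sketch. The toy taxonomy, the concept names and the function names are hypothetical, and the Jaccard overlap of ancestor sets is only a simple parameter-free stand-in, not the heuristic measure developed in the thesis.

```python
# Toy is-a taxonomy (child -> parent). All concept names are hypothetical.
IS_A = {
    "salmon": "fish",
    "carp": "fish",
    "fish": "aquatic_animal",
    "shrimp": "crustacean",
    "crustacean": "aquatic_animal",
    "aquatic_animal": "animal",
}


def ancestors(concept):
    """Return the concept followed by its is-a ancestors, nearest first."""
    chain = [concept]
    while chain[-1] in IS_A:
        chain.append(IS_A[chain[-1]])
    return chain


def path_length(a, b):
    """Edge counting: number of is-a links on the path between a and b."""
    chain_a, chain_b = ancestors(a), ancestors(b)
    depth_in_b = {c: i for i, c in enumerate(chain_b)}
    for i, c in enumerate(chain_a):
        if c in depth_in_b:  # first shared node is the lowest common ancestor
            return i + depth_in_b[c]
    raise ValueError("concepts share no ancestor")


def feature_similarity(a, b):
    """Feature overlap: Jaccard index of the two ancestor sets (assumed features)."""
    fa, fb = set(ancestors(a)), set(ancestors(b))
    return len(fa & fb) / len(fa | fb)


print(path_length("salmon", "carp"), path_length("salmon", "shrimp"))
print(feature_similarity("salmon", "carp"), feature_similarity("salmon", "shrimp"))
```

In this sketch "salmon" and "carp" are both closer by path length and more similar by feature overlap than "salmon" and "shrimp". Because the feature-based score only needs two feature sets, it could also be computed for concepts drawn from different ontologies, where no connecting path exists.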
In this paper, a feature-based method with a heuristic function is proposed to deal with multiple ontologies. The method does not depend on weighting parameters, which improves the generality of the semantic similarity measure.
Populating an existing knowledge base with new facts is important to keep it fresh and up to date. Before new knowledge is imported, the entities it contains must be linked to the entities already in the knowledge base. During this process, entity disambiguation is the most challenging task. There have been many studies that address the name ambiguity problem with a variety of algorithms. In this paper, we propose an entity linking method based on semantic knowledge, in which entity disambiguation is addressed by retrieving a variety of semantic relations and analyzing the corresponding documents with a similarity measure. This research has two main aims. First, synsets are returned based on infobox knowledge from Wikipedia, Baidu Encyclopedia and Hudong Encyclopedia. Second, if several candidate entities in the knowledge base may be relevant to the target entity in a piece of short text, an entity disambiguation process is needed to decide which candidate the target should be linked to. Based on these considerations, we propose the Entity Disambiguation Algorithm.
This paper systematically introduces the representation of a fishery knowledge base and data mining based on information retrieval. From the work above, we conclude the following:
1) The building of the knowledge base is still in a very early stage.
2) Manual intervention is very important.
3) Structured data play a decisive role in building the knowledge base.
4) The major search engines use mature algorithms to ensure the quality of the knowledge base.
5) The knowledge card is relatively conservative.
6) More complicated natural language queries will emerge (named entity disambiguation algorithm)...
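The entity disambiguation step described in this abstract, choosing among several candidate entities for a mention in a short text, can be sketched as follows. This is a minimal illustration under stated assumptions: the candidate identifiers, their descriptions, the stop-word list and the bag-of-words cosine similarity are all hypothetical stand-ins, not the Entity Disambiguation Algorithm proposed in the thesis.

```python
import math
import re
from collections import Counter

# Small illustrative stop-word list (an assumption, not from the thesis).
STOP_WORDS = {"a", "the", "in", "is", "of", "and", "by"}


def bag_of_words(text):
    """Lower-case the text and count its non-stop-word tokens."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(t for t in tokens if t not in STOP_WORDS)


def cosine(u, v):
    """Cosine similarity between two token-count vectors."""
    dot = sum(count * v[word] for word, count in u.items() if word in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def disambiguate(context, candidates):
    """Return the candidate id whose description best matches the context."""
    context_vec = bag_of_words(context)
    return max(
        candidates,
        key=lambda cid: cosine(context_vec, bag_of_words(candidates[cid])),
    )


# Hypothetical knowledge-base candidates for the ambiguous mention "bass".
CANDIDATES = {
    "Bass_(fish)": "bass is a name shared by many species of fish "
                   "caught in freshwater and marine fishing",
    "Bass_(guitar)": "the bass guitar is a plucked string instrument "
                     "played in a band",
}

print(disambiguate("fishermen caught a large bass in the lake while fishing",
                   CANDIDATES))
```

A fuller system would replace the descriptions with the documents and semantic relations retrieved for each candidate, but the decision rule stays the same: link the mention to the highest-scoring candidate.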
Keywords/Search Tags: semantic similarity, named entity disambiguation, knowledge base