
Design And Implementation Of Professional Document Annotation Based On Ontology

Posted on: 2015-08-28
Degree: Master
Type: Thesis
Country: China
Candidate: X X Chen
GTID: 2298330431978648
Subject: Computer technology
Abstract/Summary:
As the Internet develops toward intelligence in the information era, the many kinds of professional documents available on the network have become an important means of acquiring knowledge. However, with the rapid growth of document resources, information retrieval has become the bottleneck of effective knowledge acquisition. Semantic annotation is carried out to improve the semantic relevance between retrieval conditions and resources and to make full use of the computer's automatic reasoning capability. Among the various kinds of network resources that have been annotated, professional documents have received the least research attention. This paper focuses on professional document annotation based on the grammatical structure of the document and the semantic structure of a domain ontology. After studying the current state of semantic annotation technology at home and abroad, we propose a semantic annotation method based on ontology partitioning that automatically annotates the large volume of professional technical documents on the network.

The work is divided into three steps. First, identifying professional documents. We choose two bodies of work of different kinds, namely professional documents and literature. We process them with word segmentation, document frequency statistics, stop-word removal, syntax analysis, manual analysis, function fitting, and analysis of the range of high-frequency words in the documents, and compare them in terms of variance and cohesion. From the numerical description we conclude that professional documents are more rigorous. Second, domain ontology partitioning. To narrow the semantic context and improve the efficiency and accuracy of semantic annotation, the ontology is partitioned into a number of sub-ontologies according to certain rules. In simple terms, PATO partitioning consists of constructing the ontology dependency graph, cutting the graph, and presenting the resulting ontology. Using PATO together with our algorithms, we successfully partition the ontology into sub-ontologies of suitable block size. Third, semantic annotation. To ensure comprehensive annotation, WordNet is used to obtain sets of frequent near-synonyms when matching annotations against the ontology structure. The extended range is then carried over to annotate the graph structures obtained, with the help of the Stanford Parser, from sentences containing the keywords.

The annotation method in this paper improves completeness and accuracy and is compatible with load-balancing algorithms for different kinds of documents, so it also extends well.
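The partitioning step (dependency graph, then graph cut, then blocks of bounded size) can be illustrated with a minimal sketch. It is not the PATO algorithm or the thesis's own algorithm, neither of which is specified in this abstract. It assumes a simple greedy grouping as a stand-in for the graph cut, and the concept names, edge weights, and `max_block_size` parameter are hypothetical.

```python
from collections import defaultdict


def build_dependency_graph(relations):
    """Build an undirected weighted graph from (concept_a, concept_b, weight) triples."""
    graph = defaultdict(dict)
    for a, b, weight in relations:
        graph[a][b] = graph[a].get(b, 0.0) + weight
        graph[b][a] = graph[b].get(a, 0.0) + weight
    return graph


def partition(graph, max_block_size):
    """Greedy stand-in for a graph cut (an assumption, not PATO itself).

    Each block starts from a seed concept and repeatedly absorbs the
    unassigned concept most strongly connected to the block, until the
    block reaches max_block_size or has no unassigned neighbours left.
    """
    unassigned = set(graph)
    blocks = []
    while unassigned:
        seed = min(unassigned)  # alphabetical seed keeps the result deterministic
        block = {seed}
        unassigned.remove(seed)
        while len(block) < max_block_size:
            scores = defaultdict(float)
            for node in block:
                for neighbour, weight in graph[node].items():
                    if neighbour in unassigned:
                        scores[neighbour] += weight
            if not scores:
                break
            # Ties are broken alphabetically because candidates are sorted first.
            best = max(sorted(scores), key=lambda n: scores[n])
            block.add(best)
            unassigned.remove(best)
        blocks.append(block)
    return blocks


# Hypothetical toy ontology: two tightly linked clusters joined by one weak edge.
relations = [
    ("Animal", "Mammal", 1.0),
    ("Mammal", "Dog", 1.0),
    ("Mammal", "Cat", 1.0),
    ("Vehicle", "Car", 1.0),
    ("Vehicle", "Truck", 1.0),
    ("Animal", "Vehicle", 0.1),
]
graph = build_dependency_graph(relations)
blocks = partition(graph, max_block_size=4)
for block in blocks:
    print(sorted(block))
```

In this toy case the weak edge between the two clusters is the one that ends up cut, so each sub-ontology keeps its strongly related concepts together, which is the property the abstract relies on to narrow the semantic context before annotation.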
Keywords/Search Tags: semantic annotation, ontology partition, cohesion standard, extending annotation