Geographic Knowledge Base Construction Of Primary Education Based On Deep Neural Network

Posted on: 2017-01-10
Degree: Master
Type: Thesis
Country: China
Candidate: S H Miao
GTID: 2428330566453047
Subject: Computer Science and Technology
Abstract/Summary:
Humanoid intelligence has developed rapidly, and related products are already used in education, medicine, and other industries. Examples include Japan's Todai Robot project, which targets primary education examinations, and IBM's Watson project in the US, which began with quiz answering and has since expanded into the medical field. In China, primary education resources, represented by the “Arts Comprehensive”, contain a wealth of knowledge. A complete, high-quality knowledge base determines the level of humanoid intelligence that an intelligent question answering system can reach, so constructing a knowledge base from primary education resources to serve such a system is of great significance. Supported by the 863 Project “The Key Technologies of Humanoid Intelligent Knowledge Understanding and Reasoning oriented to Primary Education” (2015AA015403), this thesis takes geography as its research object, uses deep neural networks to perform relation extraction, and constructs a preliminary knowledge base by fusing multiple data sources.

1) Because general-purpose relation type systems and corpora are not suitable for the geography domain, a geographic entity relation type system and corpus are constructed manually to provide data for relation extraction. The relation type system is defined by analyzing general relation type systems and “The Data Dictionary of Fundamental Geographic Information Features”. Labeling rules are developed by analyzing the text features of “China Encyclopedia of China Geography”, and the corpus is annotated on the GATE platform.

2) The thesis builds an end-to-end neural network by defining relation extraction as a word sequence labeling problem based on character vectors. Entity relations are extracted using character features, sentence features, and the proposed type features, and the results serve as a data source for the knowledge base. Because a character's label depends on its adjacent characters, a general neural network with a sliding window mechanism extracts character features as local features. Because the information that determines a character's label is spread across the sentence, and local features can be combined by convolution and highlighted by max pooling, a convolutional neural network extracts sentence features as global features. Because type keywords, which have no contextual relation to one another, can represent the meaning of a sentence, a general neural network extracts type features as more salient global features. Relation extraction performance is measured by precision, recall, and F-score.

3) According to the characteristics of the data sources, algorithms based on text similarity are designed to handle entity alignment, property alignment, and property value conflicts, all of which arise when fusing multiple data sources. Their output provides instance data for the ontology-based knowledge base. Rare words are widespread in domain-specific articles in Baidu Encyclopedia and Wikipedia, so the inverse document frequency feature alone is too dominant to classify reliably. Document vectors take word order and context into account and therefore carry more semantic information, so entity alignment combines term frequency-inverse document frequency (TF-IDF) with document vectors. Because encyclopedia articles are edited collaboratively by hand, descriptions of property names and values vary between editors. Since property names consist of one or two words, property alignment uses a method based on word vectors. Since most property values in the encyclopedias are expressed as sentences, property value conflicts are resolved with a method based on document vectors.
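
As an illustration only, the TF-IDF half of the text-similarity entity alignment described in the abstract might be sketched as below. This is not the thesis's implementation: the function names (`tfidf_vectors`, `cosine`, `align_entities`), the smoothed IDF formula, the 0.5 threshold, and the toy English token lists are all assumptions made for the example. The document-vector component that the thesis combines with TF-IDF is omitted, and articles are assumed to be tokenized already.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one sparse {term: weight} TF-IDF vector per tokenized document."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    # Smoothed IDF (an assumed variant) keeps every weight strictly positive.
    idf = {t: math.log((1 + n_docs) / (1 + c)) + 1.0 for t, c in doc_freq.items()}
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc) or 1  # avoid division by zero for empty articles
        vectors.append({t: (c / total) * idf[t] for t, c in counts.items()})
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def align_entities(source_a, source_b, threshold=0.5):
    """Pair each entity in source_a with its most similar entity in source_b.

    Both arguments map an entity name to the token list of its article.
    Pairs whose best similarity falls below `threshold` are dropped.
    """
    names_a, names_b = list(source_a), list(source_b)
    if not names_a or not names_b:
        return []
    # Fit IDF on the articles of both sources so the weights are comparable.
    vectors = tfidf_vectors(
        [source_a[n] for n in names_a] + [source_b[n] for n in names_b]
    )
    vecs_a, vecs_b = vectors[: len(names_a)], vectors[len(names_a):]
    matches = []
    for name_a, vec_a in zip(names_a, vecs_a):
        scores = [cosine(vec_a, vec_b) for vec_b in vecs_b]
        best = max(range(len(scores)), key=scores.__getitem__)
        if scores[best] >= threshold:
            matches.append((name_a, names_b[best], scores[best]))
    return matches


if __name__ == "__main__":
    # Toy, hand-made articles standing in for two encyclopedia sources.
    baidu = {
        "Yangtze River": ["river", "longest", "china", "flows", "east", "china", "sea"],
        "Mount Tai": ["mountain", "shandong", "province", "peak"],
    }
    wiki = {
        "Chang Jiang": ["longest", "river", "china", "east", "china", "sea"],
        "Taishan": ["mountain", "peak", "shandong"],
        "Gobi": ["desert", "mongolia"],
    }
    for name_a, name_b, score in align_entities(baidu, wiki):
        print(name_a, "<->", name_b, round(score, 3))
```

Fitting the IDF weights on both sources together keeps the two sets of vectors in one weighting scheme. A fuller version in the spirit of the abstract would blend this score with a document-vector similarity to offset the dominance of rare words.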
Keywords/Search Tags:Geography, Relation Extraction, Character Vector, Knowledge Base