
The Research Of Term Relation Extraction Based On Syntax Structure

Posted on: 2018-07-15
Degree: Master
Type: Thesis
Country: China
Candidate: W Zhou
Full Text: PDF
GTID: 2348330512493106
Subject: Computer technology
Abstract/Summary:
The volume of Internet data is growing exponentially, and it is important to transform this massive, content-rich, multi-type data into knowledge effectively. At the same time, as natural language processing technology matures, it has become possible to extract useful information from Web-based open-domain text and to construct a knowledge graph from it. A term is a fixed word or phrase used in a specific scientific field to denote accurately the things, phenomena, characteristics, relationships, and processes of that field. Terms are a powerful tool for scientific research and knowledge exchange. Term relations reflect and represent the core knowledge of a field, which gives them theoretical and practical significance for understanding that field and predicting its future trends. Term relations can also be applied in related areas such as information retrieval, automatic question answering, and knowledge graph construction. However, extracting term relations from a large-scale corpus by hand is time-consuming and laborious, so automatic or semi-automatic term relation extraction has become a research hot spot. This thesis studies open-domain term relation extraction, proposes an extraction method based on syntactic structure, and uses that method as the basis for constructing a medical knowledge graph. The main work and contributions are as follows:

(1) We propose a bootstrapping method for acquiring term relation patterns with high accuracy. When patterns are used to extract term relations, the quality of the patterns directly determines the quality of the results. We exploit the diversity of Web data in the bootstrapping iterations to extend a small seed set of term relations into a large-scale term relation library. We use the word2vec tool to train word vectors and compute semantic similarity, rank the candidate term relations by that similarity, and choose the most similar one as the new seed. This avoids the semantic drift problem of traditional bootstrapping methods.

(2) We propose a term relation extraction method based on dependency syntactic structure. The method relies on dependency parsing and semantic role labeling. We also propose minimum subtree pruning to extract the verb-centered trunk of a sentence together with its semantic dependency relations, which both covers the key information and reduces the noise caused by dependency parsing errors. After generalizing the patterns, we annotate relation categories using the core verb and text analysis, then transform the results into structured data and store them in a SQL Server database for fast query processing. Experiments show that the relation extraction method based on syntactic structure captures semantic relations between terms effectively.

(3) We propose a method for constructing a knowledge graph of multi-type term relations. A knowledge graph describes the concepts, entities, events, and relations of the objective world in a structured form. Through effective knowledge integration, we address the dispersion, heterogeneity, redundancy, and fragmentation of knowledge in medical data and provide technical support for enabling computers to understand human language. To verify the validity of the proposed method, we construct a knowledge graph for the medical domain. Experiments show that the technique is of practical value and suitable for wider adoption, and that it automates term relation extraction and knowledge graph construction.
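The seed-selection step in contribution (1) can be illustrated with a minimal sketch. It is not the thesis implementation. The names `relation_similarity`, `select_new_seed`, `bootstrap`, and `min_score`, the toy three-dimensional vectors, and the scoring rule (the average of head-to-head and tail-to-tail cosine similarity against the current seeds) are all assumptions made for illustration. In practice the vectors would come from a trained word2vec model and the candidates from pattern matching over Web text. The `min_score` cutoff is an added assumption and is not described in the abstract.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def relation_similarity(candidate, seed, vectors):
    """Assumed scoring rule: mean of head-head and tail-tail cosine similarity."""
    (c_head, c_tail), (s_head, s_tail) = candidate, seed
    return 0.5 * (cosine(vectors[c_head], vectors[s_head])
                  + cosine(vectors[c_tail], vectors[s_tail]))


def select_new_seed(candidates, seeds, vectors):
    """Return (best_candidate, score), ranking by mean similarity to all seeds."""
    best, best_score = None, float("-inf")
    for candidate in candidates:
        score = sum(relation_similarity(candidate, s, vectors)
                    for s in seeds) / len(seeds)
        if score > best_score:
            best, best_score = candidate, score
    return best, best_score


def bootstrap(seeds, extract_candidates, vectors, rounds=3, min_score=0.5):
    """Grow the seed set by one relation per round.

    extract_candidates is a hypothetical callback standing in for
    pattern-based extraction; min_score is an assumed cutoff that stops
    the loop once the best remaining candidate is too dissimilar.
    """
    seeds = list(seeds)
    for _ in range(rounds):
        candidates = [c for c in extract_candidates(seeds) if c not in seeds]
        if not candidates:
            break
        best, score = select_new_seed(candidates, seeds, vectors)
        if score < min_score:
            break
        seeds.append(best)
    return seeds


# Toy vectors standing in for trained word2vec embeddings (made-up values).
vectors = {
    "aspirin": [0.9, 0.1, 0.0],
    "headache": [0.1, 0.9, 0.0],
    "ibuprofen": [0.8, 0.2, 0.0],
    "fever": [0.2, 0.8, 0.1],
    "paris": [0.0, 0.1, 0.9],
    "france": [0.1, 0.0, 0.9],
}
seed_set = [("aspirin", "headache")]
candidate_pool = [("ibuprofen", "fever"), ("paris", "france")]

grown = bootstrap(seed_set, lambda seeds: candidate_pool, vectors)
print(grown)
```

Ranking candidates against the whole seed set, rather than accepting every pattern match, is what keeps an off-topic pair such as the hypothetical ("paris", "france") from entering the seed set and steering later iterations away from the medical relation.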
Keywords/Search Tags: Natural Language Processing, Term Relation Extraction, Knowledge Graph, Dependency Syntactic Structure