
Research And Implementation Of Text-oriented Entity Relation Extraction Technology

Posted on: 2022-02-27
Degree: Master
Type: Thesis
Country: China
Candidate: X C Deng
Full Text: PDF
GTID: 2518306338466524
Subject: Computer Science and Technology
Abstract/Summary:
With the rapid development of the Internet, we are in an age of information explosion in which all kinds of information grow exponentially. Text makes up a very large part of this information, so extracting useful structured information from unstructured text has become urgent and important. Information Extraction (IE) is the technology that extracts structured information from unstructured free text. Relation extraction, the task of extracting structured triples from free text, can be used to construct knowledge graphs and to assist information retrieval, and therefore plays an important role in information extraction.

In relation extraction, dependency structure information has proven effective for feature extraction, but dependency-based methods still face two problems. On the one hand, constrained by their model structure, most previous works rely on pruning strategies, which may discard part of the context and thus limit model performance. On the other hand, although graph-convolution-based approaches model the tree structure well, they suffer from sparse adjacency matrices, so nodes cannot interact effectively with more distant yet related nodes. In addition, the unbalanced distribution of training data also hurts performance: a large gap between classes easily biases predictions, and the data augmentation methods commonly used in natural language processing offer limited help for relation extraction.

Therefore, this thesis proposes a new dependency-guided attention mechanism together with a data augmentation strategy that integrates relation label descriptions and a pre-trained language model. First, to obtain richer and context-dependent semantic information, this work uses a pre-trained language model to provide word representation vectors. Because the pre-trained model carries rich semantic information from external knowledge, the model can still make correct predictions even for labels with few training samples. Besides, a strategy of label vectors and matching calculation that incorporates label description information is proposed, so that the conceptual information of labels can be introduced as data augmentation to further improve performance.

Second, this thesis proposes a novel bidirectional dependency-guided attention model that extracts features from the dependency tree through an attention mechanism. According to the characteristics of the dependency tree, a top-down attention and a bottom-up attention are used to fully capture dependencies at different granularities and to alleviate the sparse adjacency matrix problem. Instead of pruning, hop embeddings are used to encode the hop distance of each token to the lowest common ancestor (LCA) subtree of the entities, which reduces information loss and improves performance. Experiments on classical relation extraction datasets show significant improvements.

Finally, the model is deployed on the CoreNLU natural language processing platform of the Institute of Computing Technology, Chinese Academy of Sciences, which demonstrates its practical application value.
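The label-description matching strategy summarized above can be illustrated with a minimal sketch, assuming a HuggingFace-style pre-trained encoder (bert-base-uncased here): relation labels are encoded from short textual descriptions and matched against the sentence representation, and the matching scores are fused with ordinary classifier logits. The example descriptions, variable names, and the additive fusion are illustrative assumptions, not the thesis implementation.

```python
# Minimal sketch of label-description matching for relation classification.
# Assumes a HuggingFace-style encoder; descriptions and fusion are illustrative.
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
encoder = AutoModel.from_pretrained("bert-base-uncased")

# hypothetical textual descriptions, one per relation label
relation_descriptions = [
    "no relation between the two entities",
    "the first entity is employed by the second entity",
    "the first entity was born in the second entity",
]

def encode(texts):
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        out = encoder(**batch)
    return out.last_hidden_state[:, 0]           # [CLS] vectors, (batch, hidden)

label_vecs = encode(relation_descriptions)        # (num_labels, hidden)
classifier = nn.Linear(encoder.config.hidden_size, len(relation_descriptions))

def score(sentence):
    h = encode([sentence])                        # (1, hidden) sentence vector
    logits = classifier(h)                        # ordinary classification logits
    match = h @ label_vecs.T                      # similarity to label descriptions
    return logits + match                         # fused score used for prediction

print(score("Deng works for the Institute of Computing Technology").softmax(-1))
```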
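The bidirectional dependency-guided attention with hop embeddings can likewise be sketched, assuming PyTorch and a dependency parse given as one parent index per token. The masking scheme, module names, and the way hop distances to the entities' LCA subtree are injected are assumptions for illustration, not the thesis model.

```python
# Sketch: top-down and bottom-up attention restricted by the dependency tree,
# with hop-distance embeddings replacing hard pruning. Illustrative only.
import torch
import torch.nn as nn

def ancestor_mask(heads):
    """mask[i][j] = True if token j is an ancestor of token i (or i itself);
    heads[i] is the parent index, -1 for the root."""
    n = len(heads)
    mask = torch.zeros(n, n, dtype=torch.bool)
    for i in range(n):
        j = i
        while j != -1:
            mask[i, j] = True
            j = heads[j]
    return mask

class DependencyGuidedAttention(nn.Module):
    """Two masked attention passes over the dependency tree:
    bottom-up (attend to ancestors) and top-down (attend to descendants)."""

    def __init__(self, dim, num_heads=4, max_hops=10):
        super().__init__()
        self.bottom_up = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.top_down = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        # hop-distance embedding: distance of each token to the LCA subtree
        # of the two entities, used instead of pruning the tree
        self.hop_emb = nn.Embedding(max_hops + 1, dim)
        self.out = nn.Linear(2 * dim, dim)

    def forward(self, x, heads, hop_dist):
        # x: (1, n, dim); heads: parent indices; hop_dist: (n,) clamped hop counts
        x = x + self.hop_emb(hop_dist).unsqueeze(0)
        anc = ancestor_mask(heads)                # allowed positions: ancestors
        dec = anc.T                               # transpose: descendants
        # nn.MultiheadAttention masks out positions where attn_mask is True,
        # so the allowed-position masks are inverted
        up, _ = self.bottom_up(x, x, x, attn_mask=~anc)
        down, _ = self.top_down(x, x, x, attn_mask=~dec)
        return self.out(torch.cat([up, down], dim=-1))

# toy usage: 5 tokens, token 2 is the root of the dependency tree
heads = [2, 2, -1, 2, 3]
hops = torch.tensor([1, 1, 0, 0, 1])              # hypothetical distances to the LCA subtree
layer = DependencyGuidedAttention(dim=64)
out = layer(torch.randn(1, 5, 64), heads, hops)   # (1, 5, 64) token features
```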
Keywords/Search Tags:deep learning, natural language processing, relation extraction, attention mechanism