Research On Automatic Extraction Of Chinese Named Entities And Entity Relations

Posted on: 2020-05-22; Degree: Master; Type: Thesis
Country: China; Candidate: Y J Liu; Full Text: PDF
GTID: 2428330575465496; Subject: Engineering
Abstract/Summary:
Named entity recognition and entity relation extraction are two important tasks in information extraction. In this thesis, we define named entity types and entity relation types for military texts, based on the characteristics of those texts and on the ACE 2005 Chinese dataset, and we manually annotate a military-domain dataset. We then propose two methods for handling these two tasks: a pipeline-based method and a joint-based method. The main contents of this thesis are as follows:

(1) The pipeline-based method. This method treats the two tasks as independent subtasks and processes them separately. We use the Lattice LSTM (Lattice Long Short-Term Memory) model to encode the input text and to integrate information about the words that match the vocabulary; a CRF layer then decodes the encoding to recognize Chinese named entities. Building on the recognized entities, relation extraction is treated as a classification task, for which we use the PCNN (Piecewise Convolutional Neural Network) model.

(2) The joint-based extraction method. This method treats the two tasks as a single unified task, so that information is shared between them, and extracts Chinese named entities and entity relations jointly. We propose a transition-based network model and, by designing state transition actions, transform the joint extraction task into the generation of a transition action sequence. The method first uses the Lattice LSTM model to encode the input text and a Stack LSTM (Stack Long Short-Term Memory) to implement the stack memory. It then uses a softmax layer to choose the next state transition action based on the current state of the stack, repeating until the final state is reached. The transition-based network model can identify nested Chinese entities and jointly extract Chinese named entities and entity relations.

We conduct experiments on the ACE 2005 Chinese dataset and on the annotated military-domain dataset, and we evaluate the pipeline-based method and the joint-based method on the experimental results. On the ACE 2005 Chinese dataset, the transition-based network model reaches an F1 value of 75.26% on Chinese named entity recognition and 41.28% on entity relation extraction. Compared with the pipeline-based method, the named entity recognition result increases by 8.45% and the entity relation extraction result increases by 12.41%. The experimental results show that the joint-based method outperforms the pipeline-based method on both Chinese named entity recognition and entity relation extraction.
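The abstract names PCNN as the relation classifier in the pipeline-based method but does not describe it. As a rough illustration of the "piecewise" idea in general, the sketch below splits the convolution output of a sentence into three segments at the two entity positions and max-pools each segment separately. It is not the thesis's implementation: the function name `piecewise_max_pool`, the plain-list input format, the segment boundary convention, and the choice of 0.0 for an empty segment are all assumptions made for this example.

```python
from typing import List, Sequence


def piecewise_max_pool(
    features: Sequence[Sequence[float]],
    head_pos: int,
    tail_pos: int,
) -> List[float]:
    """Max-pool filter activations over three segments of a sentence.

    features holds one row per token position and one column per
    convolution filter. head_pos and tail_pos are the token indices of
    the two entities. The result concatenates, segment by segment, the
    per-filter maximum of each segment.
    """
    if not features:
        raise ValueError("features must contain at least one position")
    first, second = sorted((head_pos, tail_pos))
    if first < 0 or second >= len(features):
        raise ValueError("entity positions must lie inside the sentence")

    num_filters = len(features[0])
    # Assumed boundary convention: each entity position closes its segment.
    segments = [
        features[: first + 1],
        features[first + 1 : second + 1],
        features[second + 1 :],
    ]

    pooled: List[float] = []
    for segment in segments:
        for f in range(num_filters):
            # Assumption: an empty segment contributes 0.0 for every filter.
            pooled.append(max((row[f] for row in segment), default=0.0))
    return pooled


# Five token positions, two filters, entities at positions 1 and 3.
example = [
    [1.0, -1.0],
    [3.0, 0.5],
    [2.0, 4.0],
    [0.0, 0.0],
    [5.0, -2.0],
]
print(piecewise_max_pool(example, head_pos=1, tail_pos=3))
```

In a full model the pooled vector would typically be passed through a non-linearity and a softmax classifier over relation types. Keeping three separate maxima preserves some information about where a feature fired relative to the two entities, which a single max over the whole sentence would discard.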
Keywords/Search Tags: Named Entity Recognition, Entity Relation Extraction, Lattice LSTM, Transition-Based Network, Stack LSTM