
Research On Chinese Relation Extraction For Complex Text Structure

Posted on: 2022-02-06    Degree: Master    Type: Thesis
Country: China    Candidate: Z R He    Full Text: PDF
GTID: 2518306554471354    Subject: Software engineering

Abstract/Summary:
Information extraction is an important branch of natural language processing; its purpose is to extract structured data from unstructured or semi-structured text. One of its most critical sub-tasks is relation extraction. Deep-learning-based relation extraction methods fall into two categories: pipeline extraction and joint extraction. Traditional methods, however, perform poorly on text with complex structure; they often cannot handle overlapping relations or the noisy information produced during extraction. This thesis studies Chinese entity relation extraction for complex text structures and proposes two optimization schemes, one for the traditional pipeline method and one for the joint learning framework. The main research and innovation contents are as follows:

(1) A pipeline extraction method based on LSTM and similarity calculation is designed. When the pipeline method is used for entity recognition and relation classification, the association between entities and relations is split apart, and in complex texts with overlapping relations the extraction results can be strongly affected by noise. An LSTM trained on labeled corpus data can extract specific entity objects more accurately, and combining it with a joint entity-relation tagging strategy keeps the extraction from becoming overly pipelined. In the experiments, a neural network model first performs named entity recognition, and sentence-level attention is then applied on top of a traditional LSTM for relation classification. Dependency grammar is introduced to extract structured entity relations and enrich the semantic features, and the classification weights of the relations are adjusted according to a similarity calculation. Verification shows that the F1 score of this optimized method is 2.76% higher than that of the basic LSTM model on Chinese datasets, and it obtains the highest score across the different datasets. The experimental results show that the method can reduce the influence of noise in the text and achieves a good optimization effect.

(2) A joint relation extraction method based on dilated convolution and character-word mixed embedding is designed. Following the idea of sequence-to-sequence decoding, the main strategy of the model is to predict the object entity directly from the subject entity. First, the characters and words of the input text are encoded separately, the resulting character and word vectors are mixed and embedded, and position information is added to the input sequence. The feature vectors are then fed into a dilated convolutional neural network for iterative training. The decoding of Chinese characters is optimized with a half-pointer, half-tagging structure: the subject entity is used to predict the object entities corresponding to each relation, and a self-attention mechanism is added to reduce the impact of noisy information. The experimental results show that the F1 score of the model on Chinese datasets is 1.88% higher than that of the control model, its precision on the public datasets reaches 87.6%, and its recall is excellent across different datasets. The results show that this joint extraction method not only simplifies the extraction process and solves the relation overlap problem, but also exhibits better robustness and generality on Chinese corpora containing multiple relations.
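To make scheme (1) concrete, the following is a minimal sketch of an LSTM relation classifier with sentence-level attention, written in PyTorch. It is not the thesis implementation: the class name, layer sizes, the max-pooled sentence representation, and the learned attention query are illustrative assumptions, and the dependency-grammar features and similarity-based weight adjustment described above are omitted.

```python
# Illustrative sketch only: a BiLSTM relation classifier with sentence-level
# attention over a bag of sentences that mention the same entity pair.
import torch
import torch.nn as nn


class AttentionLSTMRelationClassifier(nn.Module):
    def __init__(self, vocab_size, embed_dim, hidden_dim, num_relations):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        # Bidirectional LSTM encodes each sentence of the bag.
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True,
                            bidirectional=True)
        # Sentence-level attention: score each sentence against a learned
        # query so that noisy sentences contribute less to the decision.
        self.attn_query = nn.Parameter(torch.randn(2 * hidden_dim))
        self.classifier = nn.Linear(2 * hidden_dim, num_relations)

    def forward(self, token_ids):
        # token_ids: (num_sentences, seq_len), one bag per entity pair.
        embedded = self.embedding(token_ids)
        outputs, _ = self.lstm(embedded)             # (S, L, 2H)
        sent_repr = outputs.max(dim=1).values        # (S, 2H) sentence vectors
        scores = sent_repr @ self.attn_query         # (S,) attention scores
        weights = torch.softmax(scores, dim=0)       # sentence-level weights
        bag_repr = (weights.unsqueeze(1) * sent_repr).sum(dim=0)  # (2H,)
        return self.classifier(bag_repr)             # relation logits
```

In this sketch, down-weighting noisy sentences through the softmax attention weights plays the noise-reduction role that the thesis attributes to sentence-level attention.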
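For scheme (2), the sketch below illustrates the general shape of a dilated-convolution encoder over character-word mixed embeddings with a half-pointer, half-tagging head that predicts object start and end positions for each relation, conditioned on the subject span. Again this is an assumption-laden illustration, not the thesis code: the positional encoding and self-attention layer are omitted, and all names and dimensions are hypothetical.

```python
# Illustrative sketch only: dilated 1-D convolutions plus a half-pointer,
# half-tagging object prediction head conditioned on the subject entity.
import torch
import torch.nn as nn


class DilatedJointExtractor(nn.Module):
    def __init__(self, char_vocab, word_vocab, dim, num_relations):
        super().__init__()
        self.char_embed = nn.Embedding(char_vocab, dim)
        self.word_embed = nn.Embedding(word_vocab, dim)
        # Stacked convolutions with growing dilation widen the receptive
        # field without pooling, so token positions are preserved.
        self.convs = nn.ModuleList([
            nn.Conv1d(dim, dim, kernel_size=3, padding=d, dilation=d)
            for d in (1, 2, 4)
        ])
        # Half-pointer, half-tagging head: per-relation sigmoid scores that
        # mark the start and end characters of the object entity.
        self.obj_start = nn.Linear(dim, num_relations)
        self.obj_end = nn.Linear(dim, num_relations)

    def forward(self, char_ids, word_ids, subj_span):
        # Character-word mixed embedding: each character is summed with the
        # embedding of the word it belongs to (word_ids aligned per character).
        x = self.char_embed(char_ids) + self.word_embed(word_ids)  # (B, L, D)
        h = x.transpose(1, 2)                                      # (B, D, L)
        for conv in self.convs:
            h = torch.relu(conv(h)) + h           # residual dilated conv block
        h = h.transpose(1, 2)                     # (B, L, D)
        # Condition every position on the subject entity representation.
        s, e = subj_span
        subj_repr = h[:, s:e + 1].mean(dim=1, keepdim=True)        # (B, 1, D)
        h = h + subj_repr
        start_scores = torch.sigmoid(self.obj_start(h))            # (B, L, R)
        end_scores = torch.sigmoid(self.obj_end(h))                # (B, L, R)
        return start_scores, end_scores
```

Because every relation gets its own start/end pointer scores, one subject can yield objects under several relations at once, which is how this style of decoding handles overlapping relations.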
Keywords/Search Tags: Relation extraction, Deep learning, Long short-term memory network, Dilated convolution, Attention mechanism