
Research On Chinese Entity Recognition And Relationship Extraction Based On Deep Learning

Posted on: 2023-12-28
Degree: Master
Type: Thesis
Country: China
Candidate: P Luo
GTID: 2568306848481474
Subject: Software engineering
Abstract/Summary:
With the rapid development of modern information technology, the volume of text data on the network keeps growing. These texts usually contain rich and useful information, but the data are often unstructured or semi-structured and cannot be used directly. The task of information extraction arose to obtain structured, easy-to-understand, and directly usable information from such raw data. Named entity recognition and entity relation extraction, two basic subtasks of information extraction, are valuable for building domain knowledge graphs, improving machine translation, and building intelligent question answering systems. Traditional research on entity recognition and relation extraction is mostly based on rules, statistics, and machine learning, and these methods are often ill-suited to flexible and changeable modern text data. To improve the flexibility and accuracy of entity recognition and relation extraction on text data, this paper does the following work based on a deep learning framework and traditional baseline models:

(1) To handle the particularity of domain corpora, an entity recognition model that integrates an attention mechanism with statistical features is proposed, based on a study of the probability statistics of each word's participation in entities in the dataset. The experimental results show that the proposed model is applicable and effective on the domain corpus dataset: its recall is significantly higher than that of BiLSTM-Attention-CRF, and its F value is 85.41%.

(2) To improve the original model structure, a Chinese named entity recognition method based on a Transformer encoder and BiLSTM is proposed to address the loss of word vector information, position information, and direction information in the Chinese named entity recognition task. The outputs of the Embedding layer and the Position Encoding layer are concatenated along the same dimension, which avoids the loss of word vector and position information. In addition, a BiLSTM network layer with an RNN structure is introduced to recover the missing direction information, which significantly improves the model. The experimental results show that the F1 value of the method reaches 81.39% on the MSRA dataset and 88.35% on the Thangka dataset, which effectively improves Chinese named entity recognition.

(3) To address the overly wide attention range of the traditional attention mechanism and the noisy data introduced by distant supervision, a relation extraction model based on a local attention mechanism and local distant supervision is proposed. The local attention mechanism narrows the attention range by computing attention over a sliding window, and the model builds its knowledge base from local data, which reduces the proportion of noisy data in the input to a certain extent. The F1 value of the model reaches 53.07% on the Thangka dataset and 81.49% on the Baidu dataset.
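
The sliding-window idea behind the local attention mechanism can be illustrated with a minimal sketch. It is not the model from the thesis: the function name `local_attention`, the `window` parameter, and the use of the raw token vectors as queries, keys, and values (with no learned projections) are all assumptions made for illustration. Each token attends only to neighbours within `window` positions, so tokens outside the window receive zero weight.

```python
import math

def local_attention(vectors, window=1):
    """Dot-product self-attention restricted to a sliding window.

    vectors: list of equal-length lists of floats, one per token.
    window:  how many neighbours on each side a token may attend to
             (hypothetical parameter, not taken from the thesis).
    Returns (outputs, weights); weights[i][j] is 0.0 outside the window.
    """
    n = len(vectors)
    dim = len(vectors[0])
    scale = math.sqrt(dim)
    outputs, weights = [], []
    for i in range(n):
        # Only positions lo..hi-1 are visible to token i.
        lo, hi = max(0, i - window), min(n, i + window + 1)
        scores = [
            sum(a * b for a, b in zip(vectors[i], vectors[j])) / scale
            for j in range(lo, hi)
        ]
        # Numerically stable softmax over the window only.
        peak = max(scores)
        exps = [math.exp(s - peak) for s in scores]
        total = sum(exps)
        row = [0.0] * n
        for j, e in zip(range(lo, hi), exps):
            row[j] = e / total
        weights.append(row)
        # Weighted sum of the visible token vectors.
        outputs.append(
            [sum(row[j] * vectors[j][d] for j in range(lo, hi)) for d in range(dim)]
        )
    return outputs, weights
```

Calling `local_attention(vecs, window=1)` on a five-token input gives each token a weight row that sums to 1 and is non-zero only for itself and its immediate neighbours. A wider window moves the behaviour toward ordinary global attention.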
Keywords/Search Tags:Deep Learning, Entity Recognition, Relation Extraction, Transformer, Attention Mechanism