Research On Nested Named Entity Recognition Algorithm Based On Deep Learning

Posted on: 2022-01-31
Degree: Master
Type: Thesis
Country: China
Candidate: S S Xu
Full Text: PDF
GTID: 2518306539468974
Subject: Control Science and Engineering
Abstract/Summary:
Named entity recognition is a key subtask in natural language processing. It aims to identify proper names and identifiers, such as person names, organization names, time expressions, and geographic locations, in unstructured text. Accurately identifying the named entities in a text helps solve more complex problems and improves the efficiency of downstream tasks. Traditional named entity recognition is usually modeled as a sequence tagging task that extracts named entities by assigning a single entity tag to each word in the text, but this kind of method cannot handle nested entities. To identify nested named entities, related research usually adopts a two-stage approach: first find the potential entities in the text, then classify them according to predefined categories. However, existing research on nested named entity recognition ignores the correlation between entities and type information. As a result, the entity classification stage has to perform multiple classifications, which is complicated and easily leads to incorrect judgments of the entity type.

In response to these problems, the research work of the thesis is as follows. First, a nested named entity recognition model, BERT-TF, based on type focus and span enumeration is proposed. In the entity positioning stage, the model enumerates the subsequences of the text and uses a convolutional neural network to extract their features. In the entity classification stage, the model exploits the strong correlation between entities and entity type information by adding entity type features with different weights to the text representation. The entity category is thus determined in the text representation phase, and the final entity classification is simplified into a binary classification, which reduces the complexity of entity classification.

Second, to address the problem of too many irrelevant subsequences being generated during subsequence enumeration, a type information enhancement method is proposed. This method divides the text representation into non-entity word regions, possible entity word regions, and entity word regions, and adjusts the type features given to words in each region. Words in the non-entity region are given no entity type features, which filters out some irrelevant subsequences, reduces the complexity of subsequence enumeration to a certain extent, and improves the performance of the model.

Finally, to verify the effectiveness of the proposed model, a number of experiments were conducted on two public named entity recognition data sets. The experimental results on nested named entity recognition show that, among all the comparison models, the proposed model achieves the best precision, recall, and F1 score, which verifies its effectiveness. The experimental results on flat named entity recognition show that models using the proposed method also improve, indicating that the method is suitable for flat named entity recognition as well.

In summary, this thesis proposes the BERT-TF model to address the shortcomings of previous work on nested named entity recognition. The BERT-TF model makes full use of the correlation between entities and entity type information, reduces the complexity of entity classification, and improves the accuracy of nested named entity recognition.
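The span enumeration and region-based filtering described in the abstract can be illustrated with a small sketch. This is not the thesis's implementation: the function names, the region labels ("non", "maybe", "entity"), the hand-assigned regions, and the rule of dropping any span that touches a non-entity word are all assumptions made for illustration. In BERT-TF the regions and span features would come from the learned model.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # (start, end) token indices, end exclusive


def enumerate_spans(n_tokens: int, max_len: int) -> List[Span]:
    """Return every contiguous subsequence of 1..max_len tokens."""
    return [
        (start, end)
        for start in range(n_tokens)
        for end in range(start + 1, min(start + max_len, n_tokens) + 1)
    ]


def filter_spans(spans: List[Span], regions: List[str]) -> List[Span]:
    """Keep only spans that contain no token labelled 'non'.

    Assumption: a span touching a non-entity word is treated as irrelevant.
    """
    return [(s, e) for s, e in spans if "non" not in regions[s:e]]


tokens = ["The", "University", "of", "Edinburgh", "opened"]
# Hypothetical, hand-assigned region labels, one per token.
regions = ["non", "entity", "maybe", "entity", "non"]

all_spans = enumerate_spans(len(tokens), max_len=4)
kept = filter_spans(all_spans, regions)

for start, end in kept:
    print(start, end, " ".join(tokens[start:end]))
```

Both "University of Edinburgh" and the nested "Edinburgh" survive as separate candidate spans, which a single tag per word cannot express. Each remaining span would then only need a yes/no decision in the classification stage.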
Keywords/Search Tags: Deep learning, Natural language processing, Named entity recognition, BERT, Convolutional neural network