
Biomedical Named Entity Recognition Based On Local Feature Enhancement

Posted on: 2021-03-19
Degree: Master
Type: Thesis
Country: China
Candidate: Q H Lu
Full Text: PDF
GTID: 2404330647950744
Subject: Computer technology
Abstract/Summary:
Recognizing named entities in unstructured biomedical text is an important part of information extraction. As a hot topic in natural language processing research, named entity recognition has attracted extensive attention. Studies have shown that Long Short-Term Memory networks recognize named entities well in the general domain. In biomedical corpora, however, the complex composition of entities and the interference from large numbers of unrelated words in long sentences reduce the performance of Long Short-Term Memory networks. The names of many biomedical entities are also extremely long, which makes boundary recognition more difficult. This paper finds that strengthening the local connections between word features improves performance in biomedical named entity recognition. To enhance local feature extraction for this task, this paper proposes solutions from three perspectives:

(1) To generate local context representations of words, this paper uses convolutions with different kernel sizes to strengthen local feature extraction and combines them with the global view of a bidirectional Long Short-Term Memory network to extract word-level features. Two new neural network structures are proposed: a serial and a parallel combination of a Convolutional Neural Network and a Long Short-Term Memory network. Experimental results show that, compared with the baseline and other recent methods, both structures achieve significant improvements on biomedical corpus datasets.

(2) To make full use of syntactic and semantic information, such as syntactic dependency trees and predicate-argument structures, while retaining local structural information, this paper proposes a Graph Convolutional Network enhanced with syntactic and semantic information for biomedical named entity recognition. Directed graphs represent both the syntactic and the semantic information. Graph Convolutional Networks then generate word feature representations that are sensitive to each word's syntactic neighbors and to its semantic meaning relative to predicates. Experimental results show that making full use of dependency tree and predicate-argument structure information improves the accuracy of biomedical named entity recognition.

(3) To optimize the network structure and assign more attention to key local features, this paper proposes a mixed attention mechanism that dynamically uses global and local information. The self-attention mechanism models the words globally, while a learnable Gaussian prior distribution models word locality and adaptively determines the center position and width of the local window. A fusion gate then combines the output of the attention block with the order information of the words in the sequence to form a revised word output vector. Experimental results show that context information and local information are complementary, and that introducing a local attention mechanism improves biomedical named entity recognition.
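
The Gaussian prior in item (3) can be illustrated with a small sketch. The abstract does not give the exact formulation, so the code below assumes one common variant: a log-domain Gaussian bias is added to the raw self-attention scores before the softmax. The function names (`gaussian_bias`, `local_attention_weights`) are hypothetical, and the window centers and widths, which the thesis describes as learned, are passed in as plain numbers here.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def gaussian_bias(length, center, width):
    """Log-domain Gaussian prior over positions 0..length-1.

    The bias is 0 at `center` and becomes more negative with distance;
    `width` acts as the standard deviation of the local window.
    """
    return [-((j - center) ** 2) / (2.0 * width ** 2) for j in range(length)]


def local_attention_weights(scores, centers, widths):
    """Add a per-query Gaussian bias to raw scores, then normalize each row.

    scores[i][j] is an assumed global self-attention logit of query i for
    key j. centers[i] and widths[i] stand in for values a model would
    predict; in this sketch they are ordinary inputs.
    """
    weights = []
    for row, center, width in zip(scores, centers, widths):
        bias = gaussian_bias(len(row), center, width)
        weights.append(softmax([s + b for s, b in zip(row, bias)]))
    return weights


# Usage: uniform global scores, so the result shows the prior alone.
scores = [[0.0] * 5 for _ in range(5)]
weights = local_attention_weights(scores, centers=[0, 1, 2, 3, 4], widths=[1.0] * 5)
```

With uniform scores, each row of `weights` peaks at its own center position, and a smaller width concentrates more of the attention mass near that center. The fusion gate that the abstract describes is not modeled in this sketch.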
Keywords/Search Tags: Biomedical Text Mining, Named Entity Recognition, Deep Neural Network