Research On Biomedical Named Entity Recognition Based On Integrated Model

Posted on: 2021-02-23
Degree: Master
Type: Thesis
Country: China
Candidate: M Y Gao
Full Text: PDF
GTID: 2428330602489075
Subject: Software engineering
Abstract/Summary:
Named entity recognition (NER) is one of the most basic and important tasks in biomedical text mining. Its accuracy affects how effectively information can be used downstream in related medical fields, including related tasks such as relation extraction and event extraction. Most current NER methods combine neural networks with conditional random fields (CRF). Compared with traditional machine learning methods, neural networks need little manual involvement, which avoids spending resources on feature construction.

However, most current models have the following problems. First, they capture sequential (time-series) features insufficiently and deep hidden information incompletely, and they ignore the local features of the text. Biomedical literature generally has long sentences with many redundant function words, and important words are buried among them, which makes recognition harder. Second, spatial local features are not fully exploited. Local feature extraction models, represented by the convolutional neural network (CNN), are fast, but they capture information incompletely, easily lose important information, and recognize entities poorly.

This thesis therefore starts from the model structure. We propose an integrated model based on a bidirectional long short-term memory network (BiLSTM) and a convolutional neural network, and improve the two models with an attention mechanism (Attention) and dilated convolution kernels, respectively, to increase the weight of key information and to capture information over a wider range. The integrated model, BiLSTM-ATT-HDC, contains two models. One is the BiLSTM-ATT-CRF model, which combines BiLSTM, Attention, and CRF to identify entities. It avoids the shortcoming that traditional machine learning cannot obtain deep hidden information, and it strengthens the ability of deep learning methods to highlight the weight of important words. The other is an NER method that combines a hybrid dilated convolutional neural network (HDCNN) with CRF, which strengthens the ability of deep learning methods to extract spatial local features.

On the NCBI-disease dataset, the F1 score of the BiLSTM-ATT-CRF model is 83.61%, which is 1.08 percentage points higher than the base model BiLSTM-CRF. The F1 score of the integrated model BiLSTM-ATT-HDC is 84.04%, which is 1.51 percentage points higher than the base model. In summary, this thesis merges two deep learning methods to improve the accuracy of named entity recognition. Without any other hand-crafted features, it achieves a high F1 score on the NCBI-disease dataset.
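The abstract says that dilated convolution kernels let the CNN branch capture information over a wider range. The following is a minimal pure-Python sketch of that idea, not the thesis's implementation: it stacks 1D dilated convolutions and reports which neighbouring tokens can influence one output position. The kernel size of 3, the dilation rates (1, 2, 5) and (2, 2, 2), and all function names are assumptions chosen for illustration; the abstract does not state the rates actually used.

```python
from typing import List, Sequence, Set


def dilated_conv1d(x: Sequence[float], kernel: Sequence[float], dilation: int) -> List[float]:
    """Same-length 1D dilated convolution (cross-correlation) with zero padding."""
    center = len(kernel) // 2
    out = []
    for i in range(len(x)):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = i + (j - center) * dilation  # taps are spaced `dilation` apart
            if 0 <= idx < len(x):
                acc += w * x[idx]
        out.append(acc)
    return out


def receptive_field(kernel_size: int, dilations: Sequence[int]) -> int:
    """Span (in tokens) between the leftmost and rightmost reachable input."""
    rf = 1
    for d in dilations:
        rf += (kernel_size - 1) * d
    return rf


def covered_offsets(kernel_size: int, dilations: Sequence[int]) -> Set[int]:
    """Relative input offsets that can actually reach one output position."""
    center = kernel_size // 2
    offsets = {0}
    for d in dilations:
        offsets = {o + (j - center) * d for o in offsets for j in range(kernel_size)}
    return offsets


def stack(x: Sequence[float], kernel: Sequence[float], dilations: Sequence[int]) -> List[float]:
    """Apply one dilated convolution per dilation rate, in order."""
    h = list(x)
    for d in dilations:
        h = dilated_conv1d(h, kernel, d)
    return h


if __name__ == "__main__":
    hybrid = (1, 2, 5)   # assumed hybrid rates (illustrative only)
    uniform = (2, 2, 2)  # assumed uniform rates, for comparison
    for name, rates in (("hybrid", hybrid), ("uniform", uniform)):
        span = receptive_field(3, rates)
        reached = len(covered_offsets(3, rates))
        print(f"{name} {rates}: span={span}, tokens reached={reached}")
```

Under these assumed rates, three hybrid layers reach all 17 tokens in a 17-token span, while three uniform layers with rate 2 reach only 7 of the 13 tokens in their span. Avoiding such gaps is the usual motivation for mixing dilation rates, and it is consistent with the abstract's concern about losing important information.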
Keywords/Search Tags: biomedical named entity recognition, neural network, feature fusion, hybrid dilated convolutional network, bidirectional long short-term memory network