Research On Biomedical Named Entity Recognition Based On Deep Learning

Posted on:2019-03-02

Degree:Master

Type:Thesis

Country:China

Candidate:Y X Jiang

GTID:2428330566984188

Subject:Computer Science and Technology

Abstract/Summary:

Biomedical named entity recognition is an important preliminary step for many biomedical information extraction tasks, such as biomedical relation extraction and event extraction. The current mainstream methods for biomedical named entity recognition are based on neural networks, which avoid the complex hand-designed features derived from various kinds of linguistic analysis. However, the performance of existing neural network methods still falls short of the best shallow machine learning methods. How to use neural networks to improve the performance of biomedical named entity recognition is therefore the main subject of this thesis.

Conventional neural networks can ignore some potentially useful word-level and sentence-level semantic information. To address this, we propose a novel Long Short-Term Memory (LSTM) network model that integrates two channels and a sentence-level reading control gate. At the input, two channels are added to the architecture to pick up information from the pre-trained and the fine-tuned word embeddings, respectively. A sentence-level reading control gate is then introduced into the model to decide what information should be retained or discarded for future time steps. Finally, we use a CRF model to efficiently model the dependencies between tagging decisions. The experimental results show that our method achieves an F₁-score of 89.49% on the BioCreative II GM corpus.

Although integrating two channels into the network lets it consider richer semantic information, problems remain when out-of-vocabulary words occur in the corpus. We therefore consider character-level word embeddings and a language model, built on an LSTM-CRF that integrates the sentence-level reading control gate. At the input, character-level word embeddings are added to describe the spelling information of each word more accurately, and they are combined with the original word embeddings through an attention mechanism to form the final input. At the same time, a language model is integrated into the neural network to learn general-purpose patterns of semantic and syntactic composition from all available data. The features learned by the language model can then be reused in the network to predict labels more accurately. This method obtains an F₁-score of 89.94% on the BioCreative II GM corpus, superior to all existing systems, and also achieves satisfactory results on the JNLPBA corpus.

Overall, this thesis uses two deep learning architectures to improve the performance of biomedical named entity recognition. Our proposed model outperforms all existing systems on the BioCreative II GM corpus without complex hand-designed features or post-processing, with an F₁-score 0.89% higher than that of the current best-performing system.
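As an illustration of why a CRF layer on top of the LSTM helps, the sketch below decodes a tag sequence with the standard Viterbi algorithm, so that each tag is chosen together with its neighbours instead of independently per token. It is not taken from the thesis: the function name `viterbi`, the three-tag scheme, and all scores are hypothetical, and a trained model would learn the emission and transition scores.

```python
def viterbi(emissions, transitions, tags):
    """Return the highest-scoring tag sequence and its score.

    emissions[t][j]   : score of tag j at token t (e.g. produced by an LSTM).
    transitions[i][j] : score of moving from tag i to tag j (the CRF part).
    """
    n_tags = len(tags)
    best = list(emissions[0])  # best[j]: best path score ending in tag j
    backptrs = []
    for t in range(1, len(emissions)):
        new_best, ptrs = [], []
        for j in range(n_tags):
            # Previous tag that maximises the score of a path entering tag j.
            i_best = max(range(n_tags), key=lambda i: best[i] + transitions[i][j])
            new_best.append(best[i_best] + transitions[i_best][j] + emissions[t][j])
            ptrs.append(i_best)
        best = new_best
        backptrs.append(ptrs)
    # Follow the back-pointers from the best final tag to recover the path.
    last = max(range(n_tags), key=lambda j: best[j])
    score = best[last]
    path = [last]
    for ptrs in reversed(backptrs):
        last = ptrs[last]
        path.append(last)
    path.reverse()
    return [tags[i] for i in path], score


# Hypothetical scores for a three-token sentence (assumed, not from the thesis).
tags = ["O", "B", "I"]
emissions = [
    [2.0, 0.5, 0.0],
    [0.5, 1.0, 1.2],
    [0.2, 0.1, 1.5],
]
# All transitions are neutral except O -> I, which is heavily penalised
# because an entity cannot continue without having begun.
transitions = [
    [0.0, 0.0, -10.0],  # from O
    [0.0, 0.0, 0.0],    # from B
    [0.0, 0.0, 0.0],    # from I
]
path, score = viterbi(emissions, transitions, tags)
print(path, score)  # ['O', 'B', 'I'] 4.5
```

Choosing the best tag for each token on its own would give O, I, I here, which is an invalid sequence. With the transition penalty, the decoder prefers O, B, I even though B is not the top-scoring tag for the second token.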

Keywords/Search Tags:

Biomedical Named Entity Recognition, Deep Learning, Two Channels, Reading Control Gate, Language Model

Related items

1	Research On Biomedical Named Entity Recognition Based On Hybrid Model
2	A Study On The Recognition Of Biomedical Named Entity Based On Statistic
3	Recognizing Named Entities In Biomedical Literatures
4	Study On Named Entity Recognition For Chinese Specific Domains Based On Deep Learning
5	Research And Implementation Of Named Entity Recognition Based On Deep Learning
6	Research On Nested Named Entity Recognition Algorithm Based On Deep Learning
7	Biomedical Named Entity Recognition And Entity Relation Extraction Based On Deep Learning Method
8	Research Of Word Representations On Biomedical Named Entity Recognition
9	Research On Chinese Named Entity Recognition Based On Deep Learning
10	The Research On Chinese Named Entity Recognition Model Based On Cascade Neural Network