Relation Extraction Based On Word Embedding And Deep Learning In Biomedical Texts

Posted on: 2019-09-04
Degree: Master
Type: Thesis
Country: China
Candidate: J Wan
Full Text: PDF
GTID: 2428330566984206
Subject: Computer Science and Technology
Abstract/Summary:
Biomedical relation extraction is an important research topic in information extraction and plays an important role in the development of biomedical research. As one relation extraction task, Bacteria Biotope event extraction aims to extract the interactions between bacteria and their habitats or geographical locations from biomedical texts. The task explores the environments of bacteria, which is important for studying the interaction and association mechanisms between organisms and their environments, and it helps biologists better understand the distribution of bacteria and their environments. In this thesis, we propose a model based on word embeddings and deep learning to improve the performance of Bacteria Biotope event extraction.

Word embeddings are one of the key components of deep learning methods and can affect the performance of the model. We propose a biomedical domain-specific word embedding model. Furthermore, because large-scale background texts for training word embeddings are not available in some scenarios, we train word embeddings on a small set of background texts. First, our model incorporates a variety of biomedical and linguistic information, including part of speech, stem, chunk, and named entity, which makes the word embeddings more informative. Second, we take advantage of the strength of SVMs on small data sets so that the model can train word embeddings on small-scale background texts. Finally, the word embeddings trained by our model are applied to Bacteria Biotope event extraction to improve the performance of the system.

We propose an attention-based bidirectional gated recurrent unit (GRU) network architecture for Bacteria Biotope event extraction. The input is processed by a recurrent neural network with bidirectional GRUs, which avoids complex feature selection. In addition, the attention mechanism assigns different weights to the information obtained from the neural network, so that important information has more influence. We combine the weights and the information to obtain a semantic representation of the sentence and use the softmax function to predict the label. With the word embeddings trained by the model proposed in this thesis, the method achieves an F-score of 58.21% on the test set of the Bacteria Biotope event extraction task, which ranks first on this data set.

In summary, we apply a deep learning method with an added attention mechanism to Bacteria Biotope event extraction, and we use the word embedding model proposed in this thesis to improve the results. The experimental results indicate the effectiveness of both the framework and the word embedding model.
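The attention step described in the abstract (weighting the recurrent network's outputs, combining them into a sentence representation, and predicting a label with softmax) can be illustrated with a minimal sketch. The abstract does not give the exact scoring function, so the tanh-based scoring below is an assumption borrowed from common attention-pooling formulations, not necessarily the thesis's own formula. The hidden states `H` stand in for bidirectional GRU outputs and are filled with random numbers; the names `attention_classify`, `w`, `W_out`, and `b_out` are hypothetical.

```python
import math
import random


def softmax(xs):
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attention_classify(H, w, W_out, b_out):
    """Attention pooling over hidden states followed by a softmax classifier.

    H:     T hidden-state vectors of length d (assumed BiGRU outputs)
    w:     attention vector of length d (assumed scoring parameters)
    W_out: k rows of length d, b_out: k biases (assumed output layer)
    """
    # One scalar score per time step: w . tanh(h_t)
    scores = [sum(wi * math.tanh(hi) for wi, hi in zip(w, h)) for h in H]
    # Normalize scores into attention weights that sum to 1.
    alpha = softmax(scores)
    # Sentence representation: attention-weighted sum of the hidden states.
    d = len(H[0])
    sentence = [sum(a * h[j] for a, h in zip(alpha, H)) for j in range(d)]
    # Linear layer plus softmax to obtain label probabilities.
    logits = [
        sum(wj * sj for wj, sj in zip(row, sentence)) + b
        for row, b in zip(W_out, b_out)
    ]
    return alpha, softmax(logits)


# Toy inputs: 6 time steps, hidden size 8, 2 labels (relation / no relation).
random.seed(0)
T, d, k = 6, 8, 2
H = [[random.gauss(0, 1) for _ in range(d)] for _ in range(T)]
w = [random.gauss(0, 1) for _ in range(d)]
W_out = [[random.gauss(0, 1) for _ in range(d)] for _ in range(k)]
b_out = [0.0] * k

alpha, probs = attention_classify(H, w, W_out, b_out)
```

Here `alpha` holds one weight per token position and `probs` holds one probability per label. In a trained model, `w`, `W_out`, and `b_out` would be learned jointly with the recurrent network rather than sampled at random.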
Keywords/Search Tags: Relation Extraction, Deep Learning, Word Embedding, Gated Recurrent Unit, Attention Mechanism