
Biomedical Named Entity Recognition And Entity Relation Extraction Based On Deep Learning Method

Posted on: 2019-05-18  Degree: Master  Type: Thesis
Country: China  Candidate: P Yang  Full Text: PDF
GTID: 2428330566484206  Subject: Computer application technology
Abstract/Summary:
Biomedical research is closely tied to people's lives. Because the biomedical domain is highly specialized, researchers usually need to read a large number of documents to acquire enough knowledge. With the rapid growth of the biomedical literature, however, consulting the literature manually is no longer practical. Automatic information extraction for the biomedical field has therefore received much attention.
In biomedical information extraction, named entity recognition is usually a foundational task. Current state-of-the-art approaches are usually based on the CRF model, which requires experts to perform difficult and expensive feature engineering and suffers from the tagging inconsistency problem: the same mention may be tagged with different labels within one document. This thesis proposes an attention-based BiLSTM-CRF architecture to address these problems. First, for each input token, distributed word embeddings provide the word-level representation, and a BiLSTM learns the character-level representation. Additional features are used alongside these two representations. Then all inputs are fed into another BiLSTM to learn the local context. With the help of an attention mechanism, the model obtains the global context at the document level. Finally, a CRF layer produces the predicted label sequence for the document.
Automatically extracting protein-protein interactions (PPI) from the biomedical literature is also a crucial task. However, the accuracy of PPI extraction methods remains low because the available data are limited. To address this, the thesis proposes an adversarial training (AT) based approach that augments the training data for PPI extraction by aggregating different datasets. Specifically, it learns task-invariant features while alleviating the interference between datasets. In this way, the model generalizes better and classifies unseen samples more accurately. In this approach, a BiGRU layer serves as the sentence encoder, and with the help of AT the BiGRU layer learns features that cannot be used to tell apart the two datasets, AIMed and BioInfer.
Because the structured results of biomedical named entity recognition and relation extraction are not convenient for researchers to analyze and use, Web technology is used to visualize the results so that researchers can apply the proposed models in their own research.
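The document-level attention step described for the named entity recognition model can be illustrated with a minimal sketch. It is not the thesis's implementation: the abstract does not give the scoring function, so dot-product scoring is assumed here, and the names `softmax`, `dot`, and `document_attention` are hypothetical. Plain Python lists stand in for the BiLSTM outputs.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def document_attention(hidden_states):
    """Attend from every token to every token in the document.

    hidden_states: one float vector per token (stand-ins for BiLSTM outputs).
    Returns, per token, its own state concatenated with an attention-weighted
    sum of all token states (the "global context").
    Assumption: dot-product scoring; the thesis may use a different function.
    """
    outputs = []
    for h_i in hidden_states:
        # Score this token against every token in the document.
        weights = softmax([dot(h_i, h_j) for h_j in hidden_states])
        # Weighted sum of all states, computed one dimension at a time.
        context = [
            sum(w * h_j[k] for w, h_j in zip(weights, hidden_states))
            for k in range(len(h_i))
        ]
        outputs.append(h_i + context)
    return outputs


# Toy document: tokens 0 and 2 are the same mention and share a state.
states = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
enriched = document_attention(states)
```

In this toy document, the two occurrences of the same mention receive identical enriched vectors, which hints at why document-level context can reduce inconsistent tags before the CRF layer decodes the label sequence.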
Keywords/Search Tags: Named entity recognition, Conditional random fields, Attention, Relation extraction, Adversarial training