Biomedical Named Entity Recognition And Entity Relation Extraction Based On Deep Learning Method

Posted on:2019-05-18

Degree:Master

Type:Thesis

Country:China

Candidate:P Yang

Full Text:PDF

GTID:2428330566484206

Subject:Computer application technology

Abstract/Summary:

Biomedical research is closely tied to people's lives. Because the biomedical domain is so specialized, researchers usually need to read a large number of documents to acquire enough knowledge. However, with the rapid growth of the biomedical literature, manually consulting the literature has become impractical, so automatic information extraction for the biomedical field has received much attention.

In biomedical information extraction, named entity recognition (NER) is usually a foundational task. Current state-of-the-art approaches are usually based on the conditional random field (CRF) model, which requires experts to perform difficult and expensive feature engineering and suffers from the tagging inconsistency problem: the same mention may be tagged with different labels within one document. This thesis proposes an attention-based BiLSTM-CRF architecture to address these problems. First, for each input token, distributed word embeddings provide the word-level representation, and a BiLSTM learns the character-level representation. Additional features are used alongside these two representations. All inputs are then fed into another BiLSTM to learn the local context. With the help of an attention mechanism, the model obtains global context at the document level. Finally, a CRF layer produces the predicted label sequence for the document.

Automatically extracting protein-protein interactions (PPI) from the biomedical literature is also a crucial task. However, the accuracy of PPI extraction methods remains low because the available data are limited. To address this, this thesis proposes an adversarial training (AT) based approach that augments the training data for PPI extraction by aggregating different datasets. Specifically, it learns task-invariant features while reducing interference between the datasets. In this way, the model generalizes better and classifies unseen samples more accurately. In our approach, a BiGRU layer serves as the sentence encoder, and with the help of AT the BiGRU layer learns features that cannot be used to discriminate between the two datasets, AIMed and BioInfer.

Because structured results for biomedical NER and relation extraction are not convenient for researchers to analyze and use, Web technology is used to visualize the results so that researchers can use the proposed models in their own research.
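
The last two steps of the NER architecture described above (document-level attention followed by CRF decoding) can be illustrated with a minimal pure-Python sketch. It is not the thesis implementation: the dot-product scoring, the function names `document_attention` and `viterbi`, and the toy scores are all assumptions made for illustration, since the abstract does not specify the attention scoring function or the CRF parameterization.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def document_attention(hidden):
    """Return, for each token, an attention-weighted sum of all token states.

    hidden: list of per-token vectors for one document (e.g. BiLSTM outputs).
    Dot-product scoring is an assumption, not taken from the thesis.
    """
    contexts = []
    for h_i in hidden:
        # Score the current token against every token in the document.
        scores = [sum(a * b for a, b in zip(h_i, h_j)) for h_j in hidden]
        weights = softmax(scores)
        dim = len(h_i)
        contexts.append(
            [sum(w * h_j[d] for w, h_j in zip(weights, hidden)) for d in range(dim)]
        )
    return contexts


def viterbi(emissions, transitions):
    """Return the highest-scoring label sequence under a linear-chain CRF.

    emissions[t][k]:   score of label k at position t
    transitions[j][k]: score of moving from label j to label k
    """
    n_labels = len(emissions[0])
    best = list(emissions[0])  # best score of any path ending in each label
    backptrs = []
    for t in range(1, len(emissions)):
        new_best, ptrs = [], []
        for k in range(n_labels):
            prev = max(range(n_labels), key=lambda j: best[j] + transitions[j][k])
            ptrs.append(prev)
            new_best.append(best[prev] + transitions[prev][k] + emissions[t][k])
        best = new_best
        backptrs.append(ptrs)
    # Follow the back-pointers from the best final label.
    path = [max(range(n_labels), key=lambda k: best[k])]
    for ptrs in reversed(backptrs):
        path.append(ptrs[path[-1]])
    path.reverse()
    return path
```

In this sketch the two occurrences of the same token state receive the same global context vector, which is the intuition behind using document-level attention against tagging inconsistency. In a full model the context vectors would be combined with the BiLSTM states before the CRF layer, and the emission and transition scores would be learned rather than hand-set.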

Keywords/Search Tags:

Named entity recognition, Conditional random fields, Attention, Relation extraction, Adversarial training

Related items

1	Research On Entity Relation Recognition In Information Extraction
2	Research On Key Technologies Of The Information Extraction
3	Recognition Of Named Entity In Electronic Medical Records Based On Cascaded Conditional Random Fields
4	Chinese Named Entity Recognition Based On Conditional Random Fields
5	Research On Named Entity Recognition And Relation Extraction Between Entities Based On Deep Learning
6	Named Entity Recognition Based On Conditional Random Fields
7	Named Entity Recognition Based On Conditional Random Fields Chinese Research
8	The Research Of Conditional Random Fields Based Chinese Named Entity Recognition
9	A Cambodian-named Entity Recognition Study Based On Constrained Random Fields
10	Chinese Named Entity Recognition Based On Conditional Random Fields