Chinese Biomedical Text Information Extraction Based On Deep Learning

Posted on:2021-04-04

Degree:Master

Type:Thesis

Country:China

Candidate:Z Y Ding

Full Text:PDF

GTID:2428330626460397

Subject:Computer technology

Abstract/Summary:

Biomedicine is closely tied to human health and has therefore attracted wide attention. At the same time, the biomedical literature has grown exponentially. These documents contain a large amount of knowledge and are a valuable resource for researchers, but extracting that knowledge by hand costs so much time and effort that it cannot keep up with researchers' needs. Text mining technology emerged to address this problem.

Biomedical entity recognition is one of the basic tasks in text mining. We first constructed a Chinese biomedical entity relation corpus from publicly available annotated English biomedical corpora, using translation technology and manual annotation. We then trained a stroke-based ELMo on a large number of biomedical documents. Finally, we built a stroke ELMo + BiLSTM + CRF model for the entity recognition task. This model addresses the problems of polysemy and poor protein recognition.

In biomedicine, the classification of short clinical texts is an important step in building an auxiliary medical diagnosis system, with strong application prospects and clinical value. This dissertation presents a BERT-based neural network ensemble method. For the pre-trained language model, the method uses fine-tuning techniques such as continued training and gradual unfreezing. For model training, it uses techniques such as pseudo-labeling and five-fold cross-training. For feature representation, it designs a set of general features that improve short text classification and alleviate the lack of information in short texts. Compared with other BERT-based models, the method achieves higher accuracy.

Relation extraction is of great significance in biomedical information extraction. Based on the biomedical corpus, our method builds an integrated model that combines an attention-based BiLSTM with a multi-granularity Lattice, which integrates word-level information into character sequences to avoid the influence of word segmentation errors. An external linguistic knowledge base is used to address ambiguity in Chinese. Finally, the results are further improved by incorporating a set of Chinese-specific features. Experimental results show that the model extracts relations between entities well.
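
To illustrate what integrating word-level information into a character sequence can look like, the following is a minimal sketch of the lexicon-matching step that a lattice-style model typically needs: it lists every lexicon word that matches a span of the character sequence, so that no single word segmentation has to be chosen up front. The function name `build_lattice`, the `max_word_len` parameter, and the toy lexicon are assumptions made for illustration; the abstract does not describe the thesis's actual lexicon, features, or model code.

```python
from typing import List, Set, Tuple


def build_lattice(chars: str, lexicon: Set[str],
                  max_word_len: int = 4) -> List[Tuple[int, int, str]]:
    """Return (start, end, word) for every lexicon word equal to chars[start:end].

    Hypothetical helper: single characters are already nodes of the
    character sequence, so only spans of length >= 2 are considered.
    """
    spans = []
    for start in range(len(chars)):
        last_end = min(len(chars), start + max_word_len)
        for end in range(start + 2, last_end + 1):
            candidate = chars[start:end]
            if candidate in lexicon:
                spans.append((start, end, candidate))
    return spans


# Toy lexicon (an assumption, not taken from the thesis).
toy_lexicon = {"胰岛", "胰岛素", "抵抗"}
sentence = "胰岛素抵抗"  # "insulin resistance"

for start, end, word in build_lattice(sentence, toy_lexicon):
    print(start, end, word)
# 0 2 胰岛
# 0 3 胰岛素
# 3 5 抵抗
```

Both overlapping candidates that start at position 0 are kept, so a downstream model can weigh them against each other instead of committing to one segmentation early.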

Keywords/Search Tags:

Entity recognition, Short text classification, Relation extraction, Pre-trained language model

Related items

1	Research On Chinese Entity Relation Extraction Based On Schemas And Pre-trained Language Models
2	Chinese Entity Relation Extraction Based On BERT And Knowledge Verification
3	Research On Chinese Short Text Classification Based On Pre-trained Language Model
4	Research Of Joint Extraction Of Entities And Relations Based On Pre-trained Model
5	Research On The Construction Of Financial Knowledge Graph Based On Deep Learning
6	Research On Entity Relation Extraction Technology Based On Deep Learning
7	Relation Extraction Based On Dualchannel Attention And Pre-Trained Language Model
8	Research On Entity Relation Extraction For News Texts
9	Research And Application Of Entity Recognition And Relation Extraction Based On Deep Learning
10	Research On Document-level Long Text Relation Extraction Algorithms