Association Analysis Of Genes And Brucellosis Based On Literature Mining

Posted on: 2022-05-21, Degree: Master, Type: Thesis
Country: China, Candidate: R H Zhang, Full Text: PDF
GTID: 2504306527993339, Subject: Master of Agriculture
Abstract/Summary:
The biomedical literature is vast and still growing rapidly. In biomedicine, the value of literature mining lies mainly in its ability to locate and extract biomedical knowledge quickly and accurately, and the aggregated knowledge has become an important source of information. In this thesis, literature mining takes the form of named entity recognition, relation extraction, and document retrieval.

Named entity recognition identifies entities in the literature. Entities are words or phrases with specific meanings, for example diseases, proteins, genes, and DNA. This thesis adopts the GloVe + character-level BLSTM + BLSTM-CRF model to identify biomedical named entities. GloVe and a character-level BLSTM are first used to train word-level and character-level representations separately, and the two representations are then combined. The combined word vectors are then fed into the BLSTM-CRF deep learning model to identify the entity category. This model has achieved better results on the JNLPBA 2004 data set.

Automatic extraction of entity relations from the literature provides an important basis for constructing biomedical databases, but existing models often ignore sentence structure and the information contained in the target entities. This thesis uses the BERE model for relation extraction. The model combines Attention, Bi-GRU, and Gumbel tree-GRU. First, the trained word vectors are passed through Attention to capture long-range dependencies, and the result is added back to the original word vectors. Then, Bi-GRU encodes the local context of each word. Finally, Gumbel tree-GRU finds the optimal composition, and the result is output through the softmax function. Experimental results show that the model reaches an F1 value of 73.7% in biomedical relation extraction without relying on any hand-crafted features or rules.

Finally, a Django-based literature mining system for brucellosis is built, in which the two models above are embedded to provide users with methods for extracting relationships among relevant genes, variants, and diseases. It will provide clues and a theoretical basis for brucellosis research in the Inner Mongolia Autonomous Region and in China more broadly.
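
The first step of the relation extraction model described above (attention over the word vectors, with the result added back to the original vectors) can be illustrated with a minimal sketch. This is not the BERE implementation: it assumes plain scaled dot-product self-attention with no learned projection matrices, and the function names `softmax` and `self_attention_residual` are hypothetical, chosen only for this illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention_residual(vectors):
    """Attend over all word vectors and add the context back to each one.

    Assumption: unparameterised scaled dot-product attention. A trained
    model would normally apply learned query/key/value projections first.
    """
    dim = len(vectors[0])
    scale = math.sqrt(dim)
    enriched = []
    for query in vectors:
        # Similarity of this word to every word in the sentence.
        scores = [sum(q * k for q, k in zip(query, key)) / scale
                  for key in vectors]
        weights = softmax(scores)
        # Weighted average of all word vectors (the long-range context).
        context = [sum(w * vec[j] for w, vec in zip(weights, vectors))
                   for j in range(dim)]
        # Residual connection: add the context back to the original vector.
        enriched.append([x + c for x, c in zip(query, context)])
    return enriched


# Toy sentence of three 2-dimensional "word vectors" (made-up values).
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
enriched_tokens = self_attention_residual(tokens)
print(len(enriched_tokens), len(enriched_tokens[0]))
```

The output keeps the same shape as the input (one vector per word), so in a pipeline like the one described, it could be passed directly to a recurrent encoder such as a Bi-GRU.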
Keywords/Search Tags: Literature mining, Named entity recognition, Relation extraction, BERE