
Research On Literature Based Entity Recognition And Relationship Extraction Of Drug Phenotype

Posted on: 2021-02-24
Degree: Master
Type: Thesis
Country: China
Candidate: C Zhang
GTID: 2370330614470756
Subject: Computer technology
Abstract/Summary:
Information extraction from biomedical text has received growing attention, especially for drugs. Most work focuses on drug-drug interactions, with the aim of enriching knowledge bases and building a knowledge reserve for future clinical and drug research; relatively few studies address the relationship between drugs and phenotypes. A disease or uncomfortable symptom in an individual is not caused only by drugs. It can also be caused by other diseases and symptoms, which medicine refers to as complications, sequelae, and so on. It is therefore also necessary to study the relationships between such entities. In this thesis, drug entities include various types of drugs, and phenotype entities include diseases and signs. Studying drugs, phenotype entities, and their relationships can help medical researchers better understand the clinical treatment process and apply symptomatic measures and drugs in a timely manner.

This thesis mainly studies how to implement named entity recognition and relation extraction for medical text in the literature. The traditional approach to this kind of information extraction proceeds in order: named entity recognition, then relation extraction, then event extraction. Later, some researchers argued that relation extraction can be carried out as an independent task. Two kinds of methods followed: independent models that focus on relation extraction alone, and joint models that complete entity recognition and relation extraction at the same time. This thesis tries both. Chapter 3 explores the former, using current mainstream neural network methods to extract relations between drug and phenotype entities. Chapter 4 explores the latter: for the first time, a joint model combining multi-head selection and adversarial training is used to identify the entities and relations in medical text simultaneously.

The work consists of the following three parts:

(1) BiLSTM-based drug phenotype entity recognition
BiLSTM combined with CRF is used to identify medical phenotype entities in medical literature. Two forms of medical text, medical records and literature, are used to test how two different methods of obtaining word vectors affect entity recognition. The model works at the word level and uses the BIO labeling strategy, combining entity name features with word segmentation features for training. Different input features are used for different data sets, and the effect of the pre-trained model under each set of features is tested.

(2) Relation extraction method based on the attention mechanism
An attention mechanism is added to a BiLSTM neural network to extract relations. Word vectors obtained by a CNN are concatenated with position features, and relations are extracted directly without first identifying the entities.

(3) Joint entity recognition and relation extraction model based on multi-head selection
A joint model is used to train the two information extraction tasks together. The advantage of this model is that it requires no additional POS tagging tools or other manual feature extraction, and it extracts entities and relations at the same time rather than performing entity recognition first and relation extraction second. The model also adds head information and achieves good results.
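
As a rough illustration of two ideas named in the abstract, the sketch below shows BIO labeling of entity spans and a threshold-based decoding step of the kind used in multi-head selection. It is not the thesis's implementation. The function names `bio_tags` and `decode_multi_head`, the entity types `Drug` and `Phenotype`, the relation label `causes`, the example sentence, the 0.5 threshold, and the convention that `scores[i][j][r]` scores token `j` as a head of token `i` under relation `r` are all assumptions made for this example.

```python
from typing import List, Tuple


def bio_tags(tokens: List[str], spans: List[Tuple[int, int, str]]) -> List[str]:
    """Encode entity spans as BIO tags.

    Each span is (start, end_exclusive, entity_type) over token indices.
    """
    tags = ["O"] * len(tokens)
    for start, end, etype in spans:
        tags[start] = f"B-{etype}"          # first token of the entity
        for i in range(start + 1, end):
            tags[i] = f"I-{etype}"          # remaining tokens of the entity
    return tags


def decode_multi_head(
    scores: List[List[List[float]]],
    relations: List[str],
    threshold: float = 0.5,
) -> List[Tuple[int, int, str]]:
    """Keep every (token, head, relation) triple whose score exceeds the threshold.

    scores[i][j][r] is assumed to be an independent probability (for example a
    sigmoid output) that token j is a head of token i under relations[r].
    A token may therefore receive several heads, or none.
    """
    triples = []
    for i, row in enumerate(scores):
        for j, rel_scores in enumerate(row):
            for r, p in enumerate(rel_scores):
                if p > threshold:
                    triples.append((i, j, relations[r]))
    return triples


# Hypothetical example sentence and annotations.
tokens = ["Aspirin", "may", "cause", "stomach", "bleeding"]
spans = [(0, 1, "Drug"), (3, 5, "Phenotype")]
print(bio_tags(tokens, spans))
# ['B-Drug', 'O', 'O', 'B-Phenotype', 'I-Phenotype']

# Hand-written scores standing in for model output: only one pair is confident.
n = len(tokens)
scores = [[[0.0] for _ in range(n)] for _ in range(n)]
scores[4][0][0] = 0.9  # last token of the phenotype span linked to the drug token
print(decode_multi_head(scores, ["causes"]))
# [(4, 0, 'causes')]
```

Because each (token, head, relation) score is thresholded independently, one token can take part in several relations, which is what lets a joint model of this kind emit entities and relations in a single pass.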
Keywords/Search Tags:entity recognition, relation extraction, medical text, multi-head selection