Deep Learning-Based Methods For Biomedical Text Filtering And Information Extraction

Posted on: 2021-05-16
Degree: Master
Type: Thesis
Country: China
Candidate: H D An
GTID: 2480306467957869
Subject: Software engineering
Abstract/Summary:
In recent years, the Internet has developed rapidly, and the volume of valuable biomedical literature available online has grown exponentially. These professional medical documents have advanced medicine and made research far more convenient for biomedical researchers. At the same time, many patients share their symptoms and adverse drug reactions on social networks, and this information can serve as data for medical research. Both the medical literature and medical-related posts are valuable resources for advancing medicine. However, the volume of such text online is too large to process by human effort alone. Moreover, these texts are unstructured: the useful information is buried in sentences and cannot be used directly. How to quickly screen medical texts and extract structured entity relations has therefore become a popular research task in natural language processing.

The research in this thesis is divided into two parts: screening and information extraction. Screening performs a preliminary pass over the massive medical text data and removes texts that are semantically duplicated or unrelated to medicine; information extraction derives structured information from the unstructured text.

Screening comprises two tasks: biomedical text similarity and biomedical text classification. For text similarity, a Siamese network serves as the main model structure and is adjusted with an attention mechanism, which amplifies the parts of the text features related to medical information and reduces the impact of noise in the sentence. For text classification, two Bidirectional Encoder Representations from Transformers (BERT) models serve as the pre-trained models, and the final result is obtained through neural network training. The two BERT models are the standard BERT model (Uncased Base BERT) released by the Google team and a sentiment BERT model that we trained with sentiment data (SentiWordNet).

Information extraction is divided into entity recognition and relation extraction. Entity recognition identifies the entities in a sentence that can take part in relations (such as drugs, diseases, and adverse reactions); relation extraction identifies the relations between entities in the text and classifies them. Relation extraction requires knowing the entities in the sentence, which is exactly the goal of entity recognition, so the two tasks are related. This thesis therefore builds a joint learning model for entity recognition and relation extraction, with BERT as the base model.

Experiments are conducted on text similarity, text classification, entity recognition, and relation extraction. The results show that the proposed models outperform current mainstream methods.
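
The Siamese-plus-attention idea described for the text similarity task can be illustrated with a minimal sketch. This is not the thesis's model: the toy embeddings, the fixed attention query vector, and the function names below are all hypothetical, chosen only to show how a shared encoder with attention pooling can weight medically relevant tokens above noise before two sentences are compared.

```python
import math

def attention_pool(token_vectors, query):
    """Pool token vectors into one vector, weighting each by softmax(token . query)."""
    scores = [sum(t * q for t, q in zip(vec, query)) for vec in token_vectors]
    max_score = max(scores)  # subtract the max for numerical stability
    exp_scores = [math.exp(s - max_score) for s in scores]
    total = sum(exp_scores)
    weights = [e / total for e in exp_scores]
    pooled = [
        sum(w * vec[i] for w, vec in zip(weights, token_vectors))
        for i in range(len(query))
    ]
    return pooled, weights

def cosine(u, v):
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def siamese_similarity(sent_a, sent_b, embeddings, query):
    """Encode both sentences with the same embeddings and query (shared weights)."""
    enc_a, _ = attention_pool([embeddings[w] for w in sent_a], query)
    enc_b, _ = attention_pool([embeddings[w] for w in sent_b], query)
    return cosine(enc_a, enc_b)

# Hypothetical 2-d embeddings: dimension 0 ~ "medical", dimension 1 ~ "noise".
EMBEDDINGS = {
    "aspirin": [1.0, 0.0],
    "nausea": [0.8, 0.2],
    "lol": [0.0, 1.0],
    "today": [0.1, 0.9],
}
# Hypothetical attention query that favours the "medical" dimension.
QUERY = [2.0, 0.0]

if __name__ == "__main__":
    _, weights = attention_pool([EMBEDDINGS["aspirin"], EMBEDDINGS["lol"]], QUERY)
    print("attention weights (aspirin, lol):", weights)
    print("similarity:", siamese_similarity(
        ["aspirin", "lol"], ["aspirin", "today"], EMBEDDINGS, QUERY))
```

In a trained model the embeddings and the attention parameters would be learned rather than fixed by hand; the sketch only shows the data flow through the two weight-sharing branches.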
Keywords/Search Tags: Natural Language Processing, Biomedical Text, Text Similarity, Text Classification, Information Extraction