Drug Name Recognition Based On Partial Labeling And Reinforcement Learning

Posted on:2022-02-23

Degree:Master

Type:Thesis

Country:China

Candidate:M T Qu

GTID:2480306350453384

Subject:Computer Science and Technology

Abstract/Summary:

Drug name recognition is the foundation for relation extraction and event extraction in drug-related tasks, and it has important research significance in the biomedical field. Most existing drug name recognition methods are based on supervised machine learning, which typically requires a large amount of manually annotated training data. However, manually labeled data is limited while new drugs keep emerging, and this restricts the performance of drug name recognition models. This thesis analyzes the composition characteristics of drug names and proposes a neural network model based on character embeddings and drug name prefix and suffix embeddings to improve the semantic representation of drug names. In addition, distant supervision, partial labeling learning, and reinforcement learning are used to expand the training data and improve recognition performance. The main research contents of this thesis are as follows.

First, the thesis examines the composition characteristics of drug names, compiles a dictionary of drug name prefixes and suffixes, and adds prefix and suffix embeddings and character embeddings to the word embedding layer. Drug names show clear word-formation patterns; for example, many drug names share the same prefix or suffix. Adding prefix, suffix, and character embeddings to the embedding layer of the neural network model captures these patterns and improves the semantic representation of drug names, which in turn improves recognition performance.

Second, a hybrid training method that combines manually annotated data with distantly supervised data is adopted to improve the robustness and performance of the model. To suppress over-fitting during training, a portion of the distantly supervised data is added to the manually labeled data used to train the recognition model, which improves its robustness. Conversely, when training on the expanded data, a portion of the manually annotated data is added to the distantly supervised data to guide the model parameters to converge in the correct direction.

The experimental results show that both the character embedding and the mixed-data training methods described in this thesis effectively improve the performance of the model. The model can also identify some new drug names that are not yet included in the dictionary, which indicates good generalization ability.
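The two ideas above (dictionary-based affix features and mixing manually labeled data with distantly supervised data) can be illustrated with a minimal Python sketch. It is not the thesis's implementation. The affix lists, the function names `affix_ids` and `mixed_batches`, and the `aux_ratio` parameter are all hypothetical, chosen only for illustration; the abstract does not give the actual dictionary or the mixing proportion.

```python
import random

# Hypothetical affix dictionaries; the thesis's actual dictionary is not given.
PREFIXES = ["cef", "sulfa"]
SUFFIXES = ["cillin", "mycin", "statin", "azole"]


def affix_ids(token, prefixes=PREFIXES, suffixes=SUFFIXES):
    """Return (prefix_id, suffix_id) for a token; 0 means no affix matched.

    The ids could index separate prefix and suffix embedding tables whose
    vectors are concatenated with the word and character embeddings.
    """
    word = token.lower()
    prefix_id = next(
        (i + 1 for i, p in enumerate(prefixes) if word.startswith(p)), 0
    )
    suffix_id = next(
        (i + 1 for i, s in enumerate(suffixes) if word.endswith(s)), 0
    )
    return prefix_id, suffix_id


def mixed_batches(primary, auxiliary, batch_size, aux_ratio, seed=0):
    """Yield batches drawn mainly from `primary`, topped up from `auxiliary`.

    Roughly `aux_ratio` of each batch is sampled from `auxiliary`. Swapping
    the two arguments gives the opposite mix (distant data as the main source
    with some manual data added).
    """
    rng = random.Random(seed)
    n_aux = int(round(batch_size * aux_ratio))
    n_primary = batch_size - n_aux
    if n_primary < 1:
        raise ValueError("aux_ratio leaves no room for primary examples")
    order = list(primary)
    rng.shuffle(order)
    for start in range(0, len(order), n_primary):
        batch = order[start:start + n_primary]
        batch += rng.sample(auxiliary, min(n_aux, len(auxiliary)))
        rng.shuffle(batch)
        yield batch
```

For example, `affix_ids("Amoxicillin")` matches the suffix "cillin" and no prefix, and `mixed_batches(manual, distant, batch_size=4, aux_ratio=0.25)` yields batches of three manually labeled sentences plus one distantly supervised sentence.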

Keywords/Search Tags:

Drug name recognition, Partial labeling learning, Distant supervision, Reinforcement learning

Related items

1	Registration,Recognition And Labeling In 3D Point Clouds
2	Cas-GAN: An Approach Of Dialogue Policy Learning Based On Gcn And Rl Techniques
3	Study On Water Body Recognition Based On Deep Learning In Remote Sensing Image Of Cold And Dry Areas
4	Research On 3D Cell Labeling For Machine Learning
5	Learning And Learning To Discretize Partial Differential Equations
6	Research On Remote Sensing Snow Cover Recognition In Xinjiang Based On Deep Learning
7	Research And Implementation Of Plant Leaf Recognition Based On Deep Learning
8	Research On Application Of Deep Learning In Lithology Recognition Of Oil And Gas Reservoir
9	Remote Sensing Classification With Imbalanced Datasets Of Urban Land-cover Using Machine Learning
10	Research On Recognition Method Of Brain Cognitive State Based On Transfer Learning