Research On Medical Entity Recognition For Question And Answering Community Based On Deep Learning Method

Posted on: 2022-04-09
Degree: Master
Type: Thesis
Country: China
Candidate: M T Zhang
GTID: 2504306557466054
Subject: Management Science and Engineering
Abstract/Summary:
With the Internet increasingly woven into daily life, online medical and health communities have become an important channel through which the public retrieves, obtains, and shares medical and health knowledge. In recent years these communities have attracted large numbers of users and accumulated a huge volume of online medical data. This data contains valuable medical and health knowledge and has become an important source for patient demand analysis, epidemic monitoring, adverse drug reaction detection, and disease prediction. Medical entity recognition is the basis of information processing in the medical field and has become an important research direction in online medical and health information extraction and knowledge discovery. Compared with work on English, Chinese medical entity recognition has focused mainly on data such as electronic medical records and medical literature, and has paid far too little attention to online medical and health communities. Most existing research on Chinese medical entity recognition uses traditional machine learning and fails to consider deep semantic information.

Building on the deep learning model BiLSTM-CRF, this thesis proposes Self-Att-Med, a model that integrates external semantic features and introduces a Self-Attention mechanism. The model can capture more latent information and thereby improve medical entity recognition for online Q & A communities. First, medical entities are defined in five categories (disease, symptom, body part, examination, and treatment), the {B, I, O} labeling scheme is adopted, and entities are annotated with the YEDDA annotation tool. Then, the language model word2vec is used to generate character-level vectors with semantic features from an unlabeled large-scale open-domain corpus and from a small-scale medical-domain corpus, respectively. Next, the two types of character-level vectors are embedded as features into BiLSTM-CRF to produce the LSTM-Wiki and LSTM-Med models, which are compared experimentally. Finally, the Self-Attention mechanism is introduced into the best-performing model.

Ten-fold cross-validation shows that the LSTM-Med model, which uses character-level vectors generated from the medical corpus, achieves the best performance, and that the F-measure of Self-Att-Med is 0.72% higher than that of BiLSTM-CRF. The experiments also find that the increase in F-measure differs between corpora. This thesis further analyzes the recognition performance and error results for each entity type.
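
The {B, I, O} labeling scheme and the F-measure mentioned in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the thesis: the label names (`SYMPTOM`, `TREATMENT`, and so on), the example sentence, the decision to label a drug as a treatment, and the choice of exact-match, entity-level scoring. The helper functions `spans_to_bio`, `bio_to_spans`, and `entity_f1` are hypothetical names.

```python
# Hypothetical label names for the five entity categories in the abstract.
CATEGORIES = ("DISEASE", "SYMPTOM", "BODY", "EXAM", "TREATMENT")


def spans_to_bio(text, spans):
    """Turn (start, end, category) spans (end exclusive) into one tag per character."""
    tags = ["O"] * len(text)
    for start, end, category in spans:
        tags[start] = "B-" + category          # first character of the entity
        for i in range(start + 1, end):
            tags[i] = "I-" + category          # remaining characters
    return tags


def bio_to_spans(tags):
    """Recover (start, end, category) spans from a list of BIO tags."""
    spans = []
    start, category = None, None
    for i, tag in enumerate(tags + ["O"]):     # sentinel closes a trailing entity
        if tag.startswith("I-") and start is not None and tag[2:] == category:
            continue                           # still inside the open entity
        if start is not None:
            spans.append((start, i, category)) # close the open entity
            start, category = None, None
        if tag.startswith("B-"):
            start, category = i, tag[2:]       # open a new entity
    return spans


def entity_f1(gold_tags, pred_tags):
    """Entity-level F-measure: a prediction counts only on an exact span and category match."""
    gold = set(bio_to_spans(gold_tags))
    pred = set(bio_to_spans(pred_tags))
    correct = len(gold & pred)
    precision = correct / len(pred) if pred else 0.0
    recall = correct / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Made-up sentence: "headache for three days, took ibuprofen".
text = "头痛三天，吃了布洛芬"
gold_spans = [(0, 2, "SYMPTOM"), (7, 10, "TREATMENT")]
gold_tags = spans_to_bio(text, gold_spans)

# A hypothetical prediction that cuts the second entity one character short.
pred_tags = spans_to_bio(text, [(0, 2, "SYMPTOM"), (7, 9, "TREATMENT")])
f1 = entity_f1(gold_tags, pred_tags)
```

In this sketch the truncated second entity counts as both a false positive and a false negative, so one of two predictions is correct and precision, recall, and F-measure all come to 0.5. Whether the thesis scores partial matches this strictly is not stated in the abstract.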
Keywords/Search Tags: Q & A community, Deep learning, Medical entity recognition, External semantic features, Self-Attention mechanism