Research On Automatic ICD Coding Of Clinical Records Based On Deep Learning

Posted on:2022-07-18

Degree:Master

Type:Thesis

Country:China

Candidate:C Y Wang

GTID:2494306536476824

Subject:Engineering (Computer Technology)

Abstract/Summary:

With the growing adoption of electronic medical records, automatic ICD coding has become an active research topic in natural language processing. This thesis studies the content of electronic medical records and proposes the SHAN model for the ICD coding task and the RESS model for the sentence sampling task.

Existing ICD coding models are generally black-box networks that cannot give reasons for their classification results. Moreover, they almost always use only a single part of the data in the medical record and ignore other information that is helpful for classification. To address these problems, this thesis proposes the SHAN model, which combines the disease description with the doctor’s written diagnosis in the medical record. Taking the doctor’s written diagnosis as the basis for attention allocation, SHAN uses a hierarchical structure to complete the ICD classification task with higher performance, while assigning larger attention weights to the sentences in the disease description that are more relevant to the specific diagnosis. The attention allocation is then used to show the reasons for the classification. In comparative experiments, the SHAN model shows excellent performance on the MIMIC dataset and on a Chinese dataset, and it effectively provides interpretability for the ICD coding results.

While studying the SHAN model, we found that a disease description with too many sentences exceeds the capacity limits of the perception units in the classifier and also occupies a large amount of storage space. To solve these problems, this thesis proposes the reinforcement sentence sampler (RESS) model, which draws on the random exploration idea of reinforcement learning. It uses the change in classifier performance as the guide for training a model that judges the importance of each sentence in the classification process. Experiments verify that the RESS model effectively reduces the number of useless sentences in the disease description while keeping the loss of classification performance caused by removing sentences as small as possible.

In summary, with deep learning experiments as verification, this thesis addresses incomplete data usage and the black-box problem with the SHAN model, which uses hierarchical attention and provides relevant sentences as an interpretable basis while completing the ICD coding task. To address the problem of too many sentences in the disease description, it proposes the RESS model, which uses reinforcement learning to judge the importance of each sentence without manual labeling and effectively removes redundant, useless sentences.
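
The diagnosis-guided attention described for SHAN can be illustrated with a minimal sketch. This is not the thesis's implementation: the dot-product scoring, the toy two-dimensional vectors, and the function names `softmax` and `diagnosis_guided_attention` are all assumptions made for illustration. It only shows how a diagnosis vector could assign larger weights to more relevant sentences, and how those weights double as an explanation of the result.

```python
import math


def softmax(scores):
    """Turn raw relevance scores into weights that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def diagnosis_guided_attention(sentence_vecs, diagnosis_vec):
    """Weight sentence vectors by their relevance to a diagnosis vector.

    Assumption: relevance is a plain dot product; the thesis does not
    specify its scoring function in the abstract.
    """
    scores = [
        sum(s_i * d_i for s_i, d_i in zip(sentence, diagnosis_vec))
        for sentence in sentence_vecs
    ]
    weights = softmax(scores)
    dim = len(diagnosis_vec)
    # Weighted sum of the sentence vectors gives one document vector.
    doc_vec = [
        sum(w * sentence[i] for w, sentence in zip(weights, sentence_vecs))
        for i in range(dim)
    ]
    return weights, doc_vec


# Hypothetical toy inputs: three sentence vectors and one diagnosis vector.
sentences = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
diagnosis = [2.0, 0.0]
weights, doc_vec = diagnosis_guided_attention(sentences, diagnosis)

# Sentences ranked by attention weight, most relevant first.
ranking = sorted(range(len(sentences)), key=lambda i: weights[i], reverse=True)
```

In this toy setup the first sentence is most aligned with the diagnosis vector, so it receives the largest weight and appears first in `ranking`. Inspecting the weights in this way is the kind of interpretability the abstract attributes to attention allocation.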

Keywords/Search Tags:

NLP, EMR, ICD Codes, Reinforcement Learning, Attention Mechanism

Related items

1	Intelligent Decision Making In Intensive Care Units Based On Reinforcement Learning
2	Adaptive Generation Of Fundus Images Reading Report Based On Semantic And Visual Features
3	Research On Automatic Focal EEG Identification Based On Deep Reinforcement Learning
4	Research On Recommendation Method Of Treatment Regime Based On Graph Embedding And Reinforcement Learning
5	Research On The Application Of Moxibustion Based On Deep Reinforcement Learning And Imitation Learning
6	Research On Extraction Method Of Medical Image Lesion Area Based On Reinforcement Learning
7	Research On Automatic Intensity-Modulated Radiotherapy Algorithm Based On Reinforcement Learning
8	Research On Pharmaceutical Patent Text Analysis Method Based On Reinforcement Learning
9	Research On Medical Self-diagnosis Based On Knowledge Graph And Deep Reinforcement Learning
10	Research On Early Diagnosis Of Alzheimer’s Disease Based On Machine Learning And Attention Mechanism