Automatic Coding Of Medical Concept Based On Text Classification And Matching

Posted on: 2022-06-21
Degree: Master
Type: Thesis
Country: China
Candidate: Y H Huang
GTID: 2518306569494534
Subject: Computer Science and Technology
Abstract/Summary:
Medical concept coding assigns to each medical concept in a clinical text the code of the corresponding standard medical term. Because the number of codes is large and manual coding is costly and slow, automatic medical concept coding has considerable value in both research and application. This thesis therefore studies deep-learning-based methods for automatic medical concept coding. Current deep-learning methods fall into two categories. Methods based on text classification are constrained by the classification label space and are therefore sensitive to the number of standard medical terms in the terminology dictionary. Methods based on text matching use sampling techniques and are not sensitive to the number of standard medical terms in the terminology dictionary.

When a clinical text contains multiple medical concepts, classification-based methods model automatic coding as multi-label classification. Traditional methods have limited ability to express the correlations between labels. To address this, this thesis proposes an automatic coding method that combines sequence generation with a hierarchical dictionary. The method models text classification as sequence generation and introduces the hierarchical relationships between codes in the terminology dictionary through the knowledge representation algorithm TransE. The model achieves an F1-score of 0.7972 on the Chinese dataset.

Matching-based methods match clinical text containing medical concepts against standard medical terms. Because existing text matching models have many parameters and are difficult to train, this thesis proposes an improved text matching model and compares it with existing text matching models.

Matching-based methods also match the clinical text against each standard medical term separately, ignoring the relationships between terms. To address this, this thesis proposes an automatic coding method based on machine reading comprehension. The method has two phases: recall and selection. In the recall phase, it samples a number of standard medical terms as candidates for the clinical text and, for the NCBI dataset, designs a sampling method that combines edit distances at different granularities to compute similarity. In the selection phase, it converts automatic coding into multiple-choice machine reading comprehension: the clinical text containing medical concepts is the article, and the candidate terms are the options. An option-option interaction module and an article-option interaction module model the relationships among terms and the relationships between the text and the terms, and a gating mechanism fuses the relationship information. The model achieves an F1-score of 0.8192 on the Chinese dataset and an accuracy of 0.8990 on the NCBI dataset.
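
To make the recall phase concrete, the following is a minimal sketch of candidate sampling with edit distances at two granularities. The abstract does not say which granularities are used or how they are combined, so the choice of character-level and word-level Levenshtein distance, the weighted average, and the names `combined_similarity`, `recall_candidates`, and `char_weight` are all assumptions made for illustration, not the thesis's actual method.

```python
from typing import List, Sequence, Tuple


def edit_distance(a: Sequence, b: Sequence) -> int:
    """Levenshtein distance between two sequences (strings or token lists)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def similarity(a: Sequence, b: Sequence) -> float:
    """Edit distance normalized to a similarity in [0, 1]."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - edit_distance(a, b) / longest


def combined_similarity(mention: str, term: str, char_weight: float = 0.5) -> float:
    """Assumed combination: weighted mean of character- and word-level similarity."""
    m, t = mention.lower(), term.lower()
    char_sim = similarity(m, t)                  # character granularity
    word_sim = similarity(m.split(), t.split())  # word granularity
    return char_weight * char_sim + (1.0 - char_weight) * word_sim


def recall_candidates(mention: str, terms: List[str], k: int = 3) -> List[Tuple[str, float]]:
    """Return the k dictionary terms most similar to the mention."""
    scored = [(term, combined_similarity(mention, term)) for term in terms]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


if __name__ == "__main__":
    # Hypothetical toy dictionary, not taken from the NCBI dataset.
    dictionary = ["breast carcinoma", "lung cancer", "diabetes mellitus"]
    for term, score in recall_candidates("breast cancer", dictionary, k=2):
        print(f"{term}\t{score:.3f}")
```

In a two-phase design like the one described, the top-k terms returned here would become the options passed to the selection phase; a larger `k` raises the chance that the correct term is among the candidates at the cost of more options to compare.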
Keywords/Search Tags: medical concept, automatic coding, deep learning, text classification, text matching