In the current information age, vast amounts of text are recorded across many fields. In medicine in particular, large volumes of unstructured text data exist, such as electronic medical records, drug instructions, and disease records written by doctors. The knowledge and experience buried in this messy natural language cannot be mined and analyzed directly, so extracting data efficiently and accurately is essential for downstream mining and analysis tasks. Manual extraction is time-consuming, laborious, and costly, and different annotators' perceptions and standards lead to inconsistent results. Rule-based extraction generalizes poorly and lacks robustness, since the Chinese language contains a great deal of polysemy and ambiguity. Existing learning-based techniques require large amounts of annotated data and struggle to represent the many specialized, rarely used terms of the medical field. In view of these limitations and technical difficulties, this paper carries out the following research work.

First, mainstream text extraction algorithms are reviewed and studied, including the advantages and disadvantages of traditional dictionary-based extraction, statistical methods, and neural-network-based deep learning. Combining these with practical problems in the medical field, a named entity recognition model architecture is designed and optimized.

Second, given the specialized nature of the medical industry, a highly professional data set is needed. The public medical data set CCKS2017 alone cannot demonstrate effectiveness in real medical application environments, so drug instruction texts were automatically downloaded, then cleaned and preprocessed. After secondary development of the open-source labeling tool from Harbin Institute of Technology, a small amount of manual labeling was performed to obtain an unstructured medical-domain data set.

Third, exploratory data analysis (EDA) was performed on the data set. Based on the characteristics of the drug data, labels such as prepositions and segmentation words were added, and the problem of entities in long sentences was addressed through labeling techniques.

Finally, combining the realities of the medical field, a model structure for entity extraction in the medical industry is proposed. Using a transfer learning method, a BERT model pre-trained unsupervised on large-scale data is further pre-trained with a labeled lexicon of medical proper nouns to obtain higher-quality text representations. Because these embeddings are produced contextually, they can resolve the polysemy phenomenon in Chinese. The representations are then combined with a Bi-GRU deep learning model and verified on CCKS2017 and a small drug instruction data set, with five other deep learning models designed for comparison. Experimental analysis shows that the proposed model achieves the best results on the F1-score evaluation metric. In summary, for text extraction tasks in the medical field, this paper uses transfer learning to obtain higher-quality text representations through pre-training and builds a deep learning model on top of them, improving the accuracy of medical entity extraction.
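The labeling step described above (assigning per-character entity tags so that a sequence model like Bi-GRU can be trained) is commonly implemented with the standard BIO scheme for character-level Chinese NER. The sketch below illustrates that conversion; the sentence, entity spans, and label names are hypothetical examples, not taken from the thesis data set.

```python
def spans_to_bio(text, spans):
    """Convert (start, end, label) entity spans to per-character BIO tags.

    `end` is exclusive; characters outside every span receive the tag "O".
    The first character of an entity gets "B-<label>", the rest "I-<label>".
    """
    tags = ["O"] * len(text)
    for start, end, label in spans:
        tags[start] = f"B-{label}"
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"
    return tags

# Hypothetical drug-instruction sentence: "阿司匹林治疗头痛"
# ("Aspirin treats headache"), with one DRUG and one SYMPTOM span.
sentence = "阿司匹林治疗头痛"
spans = [(0, 4, "DRUG"), (6, 8, "SYMPTOM")]
for char, tag in zip(sentence, spans_to_bio(sentence, spans)):
    print(char, tag)
```

Each (character, tag) pair then serves as one training example position for the sequence tagger; this is a minimal sketch, and a real pipeline would also handle overlapping spans and sentence segmentation.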