
Research On Text Detection And Recognition Of Medical Test Sheets Based On Deep Learning

Posted on: 2024-01-29    Degree: Master    Type: Thesis
Country: China    Candidate: Q H Shen    Full Text: PDF
GTID: 2544307115495504    Subject: Electronic Information (Control Engineering) (Professional Degree)
Abstract/Summary:
In recent years, under the banner of "intelligent medical care", medical diagnosis and related technologies based on artificial intelligence have become a research hotspot. Clinical diagnosis and treatment produce large numbers of medical test sheets, and because the paper sheets are difficult to preserve, patients often photograph them for safekeeping. However, images are unstructured data and hinder the further dissemination and use of the information they contain. If their content can be accurately parsed with the help of deep learning technology, it will support the wider adoption of intelligent medical activities such as online consultation and cross-province referral. Conventional OCR algorithms, however, perform poorly on medical test sheets because of characteristics such as inconsistent layouts, diverse character types, and non-uniform medical terminology. In response to these problems, this thesis studies deep learning-based text detection and recognition for medical test sheets and a standardized matching algorithm for medical terms. The main research contents and results are summarized as follows:

(1) Design of a medical test sheet text detection model named SFE-DBNet. To address the difficulties of text detection in medical test sheets, such as varied layouts, diverse text types, and unevenly spaced horizontal text, a text detection algorithm based on the SFE-DBNet (Spatial Feature Enhancement DBNet) model is designed. First, the model uses ResNeXt50 with an attention mechanism as the backbone network to strengthen feature extraction. Second, the FPEM_FFM module extracts and fuses multi-scale features, improving the representation of text at different scales. Finally, a BLSTM is introduced to enhance the extraction of text sequence features from the feature maps, improving detection of text with varying word spacing. Comparative experiments on the medical test sheet dataset show that the model has clear advantages.

(2) Design of a medical test sheet text recognition model named RE-CRNN. To address recognition difficulties such as diverse characters and varying text lengths, the RE-CRNN (CRNN based on ResNet and En-CTC) model is designed for the text recognition task. The model takes CRNN as the baseline and uses a deep residual network as the feature extraction network to enhance its feature extraction ability. In the prediction module, En-CTC replaces ordinary CTC to alleviate the peak effect that arises in CTC training. Experiments on public datasets and the medical test sheet text dataset show that the recognition performance improves substantially over the baseline model and has a clear advantage on the test sheet recognition task. Finally, the detected and recognized text of each test sheet is exported to a spreadsheet in structured form according to its position on the sheet.
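To make the recognition pipeline in (2) concrete, the following is a minimal PyTorch sketch of a CRNN-style recognizer with a residual-network feature extractor, a bidirectional LSTM, and a standard CTC loss. It is an illustration under stated assumptions, not the thesis's RE-CRNN: the backbone here is an off-the-shelf ResNet-18, the class and variable names are invented for the example, and ordinary CTCLoss stands in for the En-CTC variant described above.

```python
# Minimal sketch (PyTorch) of a CRNN-style recognizer: ResNet feature extractor,
# bidirectional LSTM, and standard CTC loss. Names are illustrative; ordinary
# nn.CTCLoss stands in for the En-CTC variant used in RE-CRNN.
import torch
import torch.nn as nn
from torchvision.models import resnet18


class CRNNSketch(nn.Module):
    def __init__(self, num_classes: int, hidden: int = 256):
        super().__init__()
        backbone = resnet18(weights=None)
        # Keep only the convolutional stages so the width axis is preserved.
        self.cnn = nn.Sequential(*list(backbone.children())[:-2])  # (B, 512, H/32, W/32)
        self.pool = nn.AdaptiveAvgPool2d((1, None))                # collapse height
        self.rnn = nn.LSTM(512, hidden, num_layers=2,
                           bidirectional=True, batch_first=True)
        self.fc = nn.Linear(2 * hidden, num_classes)               # num_classes includes the CTC blank

    def forward(self, images: torch.Tensor) -> torch.Tensor:
        feats = self.pool(self.cnn(images))        # (B, 512, 1, W')
        seq = feats.squeeze(2).permute(0, 2, 1)    # (B, W', 512): one step per horizontal position
        out, _ = self.rnn(seq)                     # (B, W', 2*hidden)
        return self.fc(out).log_softmax(-1)        # per-step class log-probabilities


# One training step with standard CTC loss (En-CTC would replace this criterion).
model = CRNNSketch(num_classes=100)
images = torch.randn(4, 3, 32, 512)                # batch of cropped text-line images
log_probs = model(images).permute(1, 0, 2)         # CTCLoss expects (T, B, C)
targets = torch.randint(1, 100, (4, 8))            # dummy label sequences (0 is the blank)
loss = nn.CTCLoss(blank=0)(
    log_probs, targets,
    input_lengths=torch.full((4,), log_probs.size(0), dtype=torch.long),
    target_lengths=torch.full((4,), 8, dtype=torch.long))
loss.backward()
```

Only the recognition branch is sketched here; the SFE-DBNet detector in (1) additionally uses a differentiable-binarization head, FPEM_FFM feature fusion, and a BLSTM on top of its attention-augmented backbone, which are not reproduced in this sketch.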
(3) Design of a terminology standardization model named MR-BERT. To address the problem that terminology on medical test sheets is not described in a uniform form, a standardized matching model for medical terminology named MR-BERT (BERT with Multiple Recall) is designed. The model first performs multiple recall with four query methods: standard-word query, historical query, information-enhanced query, and direct query. It then builds a BERT-based number prediction and candidate standard-word matching model to output the final standardized results. The model ranked in the top 1% on the third clinical terminology standardization task dataset of the China Health Information Processing Conference (CHIP), confirming its effectiveness on related tasks.
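As an illustration of the matching stage in (3), the sketch below scores (raw term, candidate standard term) pairs with a BERT cross-encoder and keeps the highest-scoring candidates. It assumes the multiple-recall step has already produced the candidate list; the checkpoint name bert-base-chinese, the two-label scoring head, and the function rank_candidates are placeholders for this example, not the thesis's trained MR-BERT components.

```python
# Sketch (Hugging Face Transformers) of BERT-based candidate matching for
# terminology standardization. The recall stage (standard-word, historical,
# information-enhanced, and direct queries) is assumed to have produced `candidates`.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("bert-base-chinese")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-chinese", num_labels=2)      # label 1 = "candidate matches the raw term"
model.eval()


def rank_candidates(raw_term: str, candidates: list[str], top_k: int = 3) -> list[str]:
    """Score each (raw_term, candidate) pair jointly and return the top_k candidates."""
    batch = tokenizer([raw_term] * len(candidates), candidates,
                      padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        logits = model(**batch).logits       # (num_candidates, 2)
    scores = logits.softmax(-1)[:, 1]        # probability of "match"
    order = scores.argsort(descending=True)[:top_k]
    return [candidates[i] for i in order]


# Candidates as they might come out of the multiple-recall step (illustrative only;
# the untrained scoring head here returns essentially random scores).
print(rank_candidates("血清谷丙转氨酶", ["丙氨酸氨基转移酶", "天冬氨酸氨基转移酶", "总胆红素"]))
```

Only the pairwise scoring is sketched; in the thesis's pipeline this matching model is paired with a number prediction component so that a raw term mapping to several standard words can be handled.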
Keywords/Search Tags: Medical test sheets, Deep learning, Text detection and recognition, Terminology standardization