Research And Implementation Of Key Technologies For Medical Document Text Recognition Based On Deep Learning

Posted on:2020-09-07

Degree:Master

Type:Thesis

Country:China

Candidate:W L Shao

Full Text:PDF

GTID:2428330575956460

Subject:Information and Communication Engineering

Abstract/Summary:

PDF Full Text Request

With the development of the Internet,the combination of the medical industry and the Internet has become more closely related.With the continuous opening of policies,many medical-related Internet companies have emerged.In the Internet medical industry,the transmission of data is the basis for online communication.Medical documents are the main data in the medical industry.However,most hospital often do not provide electronic data of medical documents,which brings great difficulties for medical Internet data exchange.This paper takes the test sheet,one of the most common medical documents,as an example to explore the algorithm for the text recognition of test sheet.The commonly used optical character recognition tools on the market often require horizontal,less noisy text,and the test sheets' shooting conditions are often poor,so the general optical character recognition tools perform very badly on the test sheet.This paper divides the test sheets' text recognition task into three main steps according to the characteristics of the test sheets:table area detection,text detection,text recognition,and through three deep learning-based models.The main contributions of this article are:First,the table area detection is implemented using a semantic segmentation model.Based on the traditional semantic segmentation model U-Net,according to the characteristics of the test sheet's table area,the attention mechanism is firstly used to assign more weight to the features of the pixel points in the table area,thereby improving the accuracy of semantic segmentation.The traditional semantic segmentation model does not constrain the shape of generated segmentation image.This paper uses a generative adversarial network module to constrain the shape of table area detected by semantic segmentation model.The experimental results show that our improved model is better than the traditional semantic segmentation model.Secondly,this paper uses the sequence labeling model to achieve image page segmentation.The image page segmentation algorithm is used to segment the row and column data in the table area to detect the text area.Traditional rule algorithms that require a large amount of prior knowledge and is difficult to determine a uniform threshold.However,the sequence labeling model has solved these problems and improves the effect of text detection.At the same time,the paper also uses image feature extraction and text generation model to achieve text recognition.Finally,in order to verify the effectiveness of algorithms proposed in this paper,this paper links the processes of each algorithm to realize a medical document text recognition system.The results show that the proposed algorithms are feasible in practical applications..

Keywords/Search Tags:

Medical Document, Semantic Segmentation, Text Detection, Text Recognition

PDF Full Text Request

Related items

1	Text Detection And Recognition Based On Semantic Segmentation And Attention
2	Research On End-to-End Text Recognition In Document Images And Its Applications
3	Studies Of Scene Text Detection And Recognition Based On Deep Learning
4	Studies Of Scene Text Detection And Recognition Based On Deep Learning
5	Research On Deep Learning Based Text Detection And Recognition
6	Research On Deep-Learning-Based Text Recognition And Document Segmentation And Its Application
7	Research On Text Detection And Recognition In Complex Natural Scene Image
8	Research And Implementation Of Papers’ Layout Segmentation Algorithm
9	Study On Method To Automatically Analyze The Text Structure Based On The Relevancy Computing Of Text Content
10	Research On Key Technologies Of Text Line Analysis Based On New CNN Instance Segmentation Algorithm