
Research On Text Detection And Recognition For Medical Insurance Images

Posted on: 2020-09-02
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Q B Xie
GTID: 1364330599461809
Subject: Computer system architecture
Abstract/Summary:
Text in images carries rich and accurate high-level semantic information and is a key element for understanding images. Image text detection and recognition is widely used in fields such as autonomous driving, electronic archives, and intelligent medical care. It has attracted great attention from academia and industry and has become an active research topic. Despite the progress so far, image text detection and recognition in the medical insurance industry still faces major challenges.

First, medical insurance images have low resolution and high noise. The images are compressed many times during collection, which lowers their resolution, and they are repeatedly photocopied during use, which adds further noise. Low-resolution, high-noise images make it difficult to separate text regions from non-text regions correctly. Second, table lines interfere strongly with detection. Unlike natural scenes, medical insurance images contain many tables in which the gaps between text, numbers, and table lines are small. During detection, table lines and text are often merged, which causes recognition errors. Third, medical insurance images require character-level detection. Existing detection algorithms miss many characters and therefore cannot detect single characters effectively. Fourth, medical insurance images include a large number of medical invoices and itemized lists that carry one or more colored stamps, which lowers recognition accuracy for the occluded text. In addition, a single recognition model cannot effectively handle both single characters and text sequences.

To address these problems, this dissertation studies text detection and recognition for medical insurance images, drawing on recent progress in object detection, contour detection, pattern recognition, deep learning, and related fields. The main contributions are as follows:

1) To address the difficulty of detecting text boxes in low-resolution, high-noise medical insurance images, a segmentation-based detection model is proposed that applies a feature attention mechanism to pixel classification. The model consists of two parts: a detection network and text box generation. The detection network applies the feature attention mechanism in the feature processing stage, so that the network retains as much important information as possible during feature processing and learns more accurate features. For text box generation, a text region detection algorithm based on the pixel centerline is proposed; it computes the text boxes from the output of the detection network, which completes text box detection for low-resolution, high-noise medical insurance images. Experiments show that the model achieves better text detection results on such images.

2) To address table line interference in medical insurance images, a CEN model consisting of two sub-networks, feature coding and feature sampling, is designed. In the feature coding sub-network, a feature enhancement module is designed to retain text contour information. The module processes the shallow features of the image effectively and distinguishes text from table lines. In the feature sampling sub-network, deconvolution filters replace the traditional up-sampling operation to enlarge the deep features extracted in the previous stage, which retains as much edge information as possible while enlarging the feature receptive field. Finally, the CEN model post-processes the network output to obtain accurate text detection results for tables in medical insurance images. Experiments show that the model achieves better text detection results on medical insurance form images.

3) To detect single characters effectively, a detection method that combines a fully convolutional network with an ant colony algorithm is proposed. A fully convolutional network first produces a text heat map, and single characters are then detected on the basis of this heat map. For single-character detection, an optimized ant colony algorithm is proposed. The algorithm initializes the ant positions using gradient information from image partitions, optimizes the edge search of the colony according to the characteristics of text objects, and uses an adaptive pheromone dissipation coefficient to govern ant movement. This addresses the slow convergence and noise sensitivity of the traditional ant colony algorithm, so that single characters in medical insurance images are segmented accurately. Experiments show that the method achieves better single-character detection on medical insurance images.

4) To recognize stamp-covered text and the slightly curved text inside seals effectively, a recognition algorithm that integrates multiple color spaces and multiple models is proposed. First, the RGB channels are separated, the most suitable color channel is selected for recognition, and the result is compensated and fused with information from the other channels. Second, character-based recognition and sequence-based recognition are both run on the selected channel, and the results of the two models are fused. To recognize slightly curved text, a recognition model based on a four-line attention mechanism is proposed, which handles the slightly curved seal text in medical insurance images. Experiments show that the ensemble algorithm effectively recognizes stamp-covered text, single characters, and slightly curved characters in medical insurance images.
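The channel-separation step in contribution 4) can be illustrated with a small sketch. This is not the dissertation's algorithm: the abstract does not say how the "most suitable" channel is chosen, so the brightest-mean heuristic, the fixed threshold of 128, the toy pixel values, and all function names below are assumptions made only for illustration. The compensation and fusion with the other channels is not shown.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]   # (R, G, B), each in 0..255
Plane = List[List[int]]        # one grayscale channel


def split_channels(image: List[List[Pixel]]) -> List[Plane]:
    """Separate an RGB image into its R, G and B planes."""
    return [[[px[c] for px in row] for row in image] for c in range(3)]


def mean_intensity(plane: Plane) -> float:
    """Average pixel value of one channel."""
    total = sum(sum(row) for row in plane)
    count = sum(len(row) for row in plane)
    return total / count


def pick_channel(image: List[List[Pixel]]) -> int:
    """Assumed heuristic: on light paper, the brightest channel is the one
    in which coloured stamp ink blends best into the background."""
    means = [mean_intensity(p) for p in split_channels(image)]
    return means.index(max(means))


def binarize(plane: Plane, threshold: int = 128) -> Plane:
    """Mark dark pixels as 1 (candidate text) and light pixels as 0."""
    return [[1 if v < threshold else 0 for v in row] for row in plane]


# Toy 2x3 image (hypothetical values): dark text partly under a red stamp.
PAPER = (255, 255, 255)
TEXT = (20, 20, 20)
STAMP = (220, 30, 40)          # red ink on paper
STAMPED_TEXT = (25, 10, 12)    # dark text under red ink stays dark

image = [
    [PAPER, TEXT, STAMP],
    [STAMP, STAMPED_TEXT, PAPER],
]

best = pick_channel(image)                      # index 0 is the red channel
planes = split_channels(image)
text_mask = binarize(planes[best])              # stamp pixels drop out
green_mask = binarize(planes[1])                # stamp pixels look like text
```

In this toy case the red channel is selected, and thresholding it keeps only the two text pixels, while thresholding the green channel also marks the stamp ink as text. A real pipeline would need a more robust selection rule, for example when stamps are blue or the paper is tinted.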
Keywords/Search Tags:Medical Insurance Image, Text Detection, Text Recognition, Feature Attention, Feature Coding, Ant Colony Algorithm, Model Integration