Research On Scene Low Quality Text Recognition Method Based On Two-stage Learning

Posted on: 2023-09-29 | Degree: Master | Type: Thesis
Country: China | Candidate: P C Luo | Full Text: PDF
GTID: 2568306626481094 | Subject: digital media technology
Abstract/Summary:
In recent years, natural scene text recognition has become a popular technology and is widely used in autonomous driving, criminal investigation and other settings. However, recognition accuracy on low-quality natural scene text images, which are degraded by hardware, acquisition and encoding mode, focal length and aperture, is usually unsatisfactory. To address this problem, this thesis proposes a scene low-quality text recognition method based on two-stage learning. Starting from semantic feature reconstruction, a super-resolution model learns and reconstructs the text information and enhances the expressive power of the sequence features. A detection and recognition model then performs recognition, forming with the corresponding label text a parameter mapping with stronger generalization ability, which ultimately improves the recognition accuracy on low-quality natural scene text images. The experimental results show that the method achieves an average PSNR of 31.45 dB and an average SSIM of 0.89. Under different recognition difficulties, the LEV values of text recognition are 1.40, 7.92 and 16.54 respectively. Compared with existing recognition methods, the proposed method achieves better recognition results. The main contents of this thesis are as follows:

(1) Because the degradation model of low-quality natural scene text images is unknown, the CLCN network is proposed and built on the basis of SRGAN as the super-resolution learning model. A closed-loop structure is introduced into the adversarial network, so that the generator changes from a univariate mapping to a binary mapping, which further reduces the mapping space between LR and HR images in super-resolution reconstruction and yields better semantic reconstruction performance. The experimental results show that, compared with recent methods such as KernelGAN and DRN, the reconstruction results of the proposed CLCN network increase the average PSNR by 0.33 to 1.92 dB and the average SSIM by 0.02 to 0.07.

(2) The super-resolution reconstruction model is further optimized with the training focus placed on the discriminator: the wavelet transform is introduced to increase the feature dimension of the discriminator, which pushes the generator to make the high-frequency information of the reconstructed image more prominent. At the same time, the residual structure is modified and partly replaced to offset the increase in model complexity caused by the closed-loop structure. The experimental results show that, as the number of iterations increases, the convergence speed of this method is 20% to 40% higher than that of other methods.

(3) To account for the way text blends into the scene in low-quality natural scene text images, the TDRN network, which integrates target localization and recognition, is proposed and built as the detection and recognition learning model. Sequence information is reused by sharing convolutional features, which improves the utilization of semantic information. At the same time, the interpolation kernel function of the offset adjustment module is replaced to avoid losing feature pixels during interpolation. CLCN and TDRN are combined to form the two-stage learning model. The experimental results show that the LEV recognition index of this method is 1.05, 1.96 and 1.40 lower, respectively, than that of the recent STN-OCR.
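
The abstract reports its results as PSNR (in dB) and LEV values without defining them. The following is a minimal Python sketch of how such metrics are commonly computed, not the evaluation code of the thesis. It assumes that PSNR is the standard peak signal-to-noise ratio over 8-bit grayscale pixels and that LEV denotes the Levenshtein edit distance between the recognized string and its label, averaged over the test samples; the abstract does not confirm either assumption. The function names `psnr`, `levenshtein` and `mean_lev` are hypothetical and chosen only for illustration.

```python
import math


def psnr(reference, reconstructed, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two grayscale images.

    Images are given as lists of rows of pixel values. Assumption:
    8-bit pixels, so the peak value defaults to 255.
    """
    ref = [p for row in reference for p in row]
    rec = [p for row in reconstructed for p in row]
    if not ref or len(ref) != len(rec):
        raise ValueError("images must be non-empty and of equal size")
    # Mean squared error over all pixels.
    mse = sum((a - b) ** 2 for a, b in zip(ref, rec)) / len(ref)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


def levenshtein(predicted, label):
    """Minimum number of single-character insertions, deletions and
    substitutions needed to turn `predicted` into `label`."""
    # prev[j] is the distance between the processed prefix of
    # `predicted` and the first j characters of `label`.
    prev = list(range(len(label) + 1))
    for i, p in enumerate(predicted, start=1):
        curr = [i]
        for j, q in enumerate(label, start=1):
            cost = 0 if p == q else 1
            curr.append(min(prev[j] + 1,          # delete p
                            curr[j - 1] + 1,      # insert q
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def mean_lev(predictions, labels):
    """Average edit distance over a test set (assumed meaning of LEV)."""
    if not labels or len(predictions) != len(labels):
        raise ValueError("need equally many predictions and labels")
    total = sum(levenshtein(p, t) for p, t in zip(predictions, labels))
    return total / len(labels)


if __name__ == "__main__":
    print(levenshtein("kitten", "sitting"))            # 3
    print(mean_lev(["STOP", "EXlT"], ["STOP", "EXIT"]))  # 0.5
```

Under this reading, a lower LEV value means fewer character edits separate the recognized text from its label, which is consistent with the abstract reporting a reduction in LEV as an improvement over STN-OCR.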
Keywords/Search Tags: deep learning, image super-resolution, scene text recognition, low-quality text image, closed-loop countermeasure network