Text Image Recognition Based On Improved Binarization Algorithm And Tesseract-OCR Engine

Posted on: 2024-05-28, Degree: Master, Type: Thesis
Country: China, Candidate: Z Guo
GTID: 2568307115495134, Subject: Electronic information
Abstract/Summary:
With the continued development of technology, character recognition has been widely adopted across industries to support intelligent automation. However, many factors still affect overall accuracy and efficiency when recognizing text images. For example, images may be captured under insufficient lighting, with noise interference, or at a tilted angle, all of which lower the recognition rate. To reduce these effects, the text image must be preprocessed and the text recognition system itself optimized. This thesis therefore studies an improved binarization algorithm and improves the traditional Tesseract-OCR engine to raise recognition performance on text images in complex conditions. Improving the traditional binarization algorithm reduces the influence of insufficient illumination on recognition and raises processing efficiency while producing a better binarization result. Improving the Tesseract-OCR engine raises recognition efficiency and accuracy on complex long texts. The main work of this thesis is as follows:

(1) For image binarization in the preprocessing stage, this thesis proposes an improved binarization algorithm. The algorithm addresses two problems that remain in traditional binarization algorithms: artifacts in the output and slow computation. It first improves the traditional Niblack algorithm, making it more adaptive by changing the size of the neighborhood window adaptively and adjusting the correction coefficient dynamically. It also uses an integral image to improve processing efficiency, and then combines the result with the traditional global threshold algorithm, Otsu's method, to reduce artifacts and obtain better segmentation results.

(2) For the optimization of the text recognition system, this thesis proposes two measures based on the working principle of the Tesseract-OCR engine. First, it replaces the original default activation function with the Swish activation function, so that neurons in the neural network are activated more effectively. Second, it adds the Luong attention mechanism to the model so that the OCR system can adjust its attention weights adaptively and better recognize complex long texts.

(3) To verify the effectiveness of the two improvements, comparative experiments were carried out. For the improved binarization algorithm, processing time is significantly reduced, and the results are better than those of the traditional binarization algorithm on commonly used binarization evaluation metrics. For the improved Tesseract-OCR engine, an ablation experiment shows that accuracy, recall, and F1 score all improve over the original model. Finally, combining the two improvements into one method further raises the recognition rate of the Tesseract-OCR engine, and the resulting F1 score is 5.23% higher than that of the original model.
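
As an illustration of the binarization step in item (1), the following is a minimal sketch of Niblack thresholding computed with integral images, with Otsu's global threshold as a fallback. It is not the thesis's algorithm: the window size and the coefficient `k` are fixed here rather than adapted, and the rule for combining the two thresholds (use Otsu wherever the local standard deviation falls below `min_std`) is an assumption made for the example. The function names `local_mean_std`, `otsu_threshold`, and `binarize` and all parameter defaults are hypothetical.

```python
import numpy as np


def local_mean_std(gray, window):
    """Local mean and standard deviation in a window, via integral images."""
    img = np.asarray(gray, dtype=np.float64)
    h, w = img.shape
    r = window // 2
    # Integral images of I and I^2, with a leading row and column of zeros.
    ii = np.pad(img, ((1, 0), (1, 0)), mode="constant").cumsum(0).cumsum(1)
    ii2 = np.pad(img * img, ((1, 0), (1, 0)), mode="constant").cumsum(0).cumsum(1)
    # Window bounds for every pixel, clipped at the image border.
    y0 = np.clip(np.arange(h) - r, 0, h)[:, None]
    y1 = np.clip(np.arange(h) + r + 1, 0, h)[:, None]
    x0 = np.clip(np.arange(w) - r, 0, w)[None, :]
    x1 = np.clip(np.arange(w) + r + 1, 0, w)[None, :]
    area = (y1 - y0) * (x1 - x0)
    # Each window sum costs four lookups, independent of the window size.
    s = ii[y1, x1] - ii[y0, x1] - ii[y1, x0] + ii[y0, x0]
    s2 = ii2[y1, x1] - ii2[y0, x1] - ii2[y1, x0] + ii2[y0, x0]
    mean = s / area
    var = np.maximum(s2 / area - mean * mean, 0.0)
    return mean, np.sqrt(var)


def otsu_threshold(gray):
    """Global threshold that maximizes between-class variance (8-bit input)."""
    gray = np.asarray(gray, dtype=np.uint8)
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    prob = hist / hist.sum()
    omega = np.cumsum(prob)                 # probability of class "<= t"
    mu = np.cumsum(prob * np.arange(256))   # cumulative mean up to t
    denom = omega * (1.0 - omega)
    with np.errstate(divide="ignore", invalid="ignore"):
        sigma_b = (mu[-1] * omega - mu) ** 2 / denom
    sigma_b[denom == 0] = 0.0
    return int(np.argmax(sigma_b))


def binarize(gray, window=15, k=-0.2, min_std=10.0):
    """Niblack threshold T = mean + k * std, with an Otsu fallback."""
    gray = np.asarray(gray, dtype=np.uint8)
    mean, std = local_mean_std(gray, window)
    local_t = mean + k * std
    # Assumption for this sketch: in nearly flat windows, where Niblack
    # tends to produce background artifacts, use the global Otsu threshold.
    t = np.where(std < min_std, float(otsu_threshold(gray)), local_t)
    return np.where(gray > t, 255, 0).astype(np.uint8)
```

Calling `binarize(gray)` on an 8-bit grayscale array returns an array of the same shape containing 0 for pixels at or below the threshold and 255 for pixels above it. Because the window sums come from the integral images, the cost per pixel does not grow with `window`, which is the efficiency gain the abstract attributes to the integral image.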
Keywords/Search Tags: text recognition, image processing, binarization algorithm, Tesseract-OCR, attention mechanism