Research And Application Of Chinese Text Image Distortion Correction

Posted on: 2015-02-07
Degree: Master
Type: Thesis
Country: China
Candidate: X Wang
Full Text: PDF
GTID: 2268330428972675
Subject: Signal and Information Processing
Abstract/Summary:
With the development of digital information, optical character recognition (OCR), with its ability to convert printed text into electronic documents, has greatly reduced the workload of language-school staff and won favor with libraries and government departments. However, OCR has its drawbacks: for example, the character recognition rate is low for distorted images captured from thick books with digital devices. To address this problem, this thesis reviews the distorted-image correction methods proposed in recent years in China and abroad, analyzes the strengths and weaknesses of the connected-component labeling method and the text-line extraction method, and then proposes a text image distortion correction method based on reconstruction (refactoring) from multiple text lines.
First, the thesis surveys the current state of distortion correction research. Second, it briefly covers the image processing background the work relies on, such as grayscale conversion, binarization, and image segmentation. Third, it explains the specific roles these algorithms play in the work, describes the characteristics of distorted images, and analyzes the advantages and disadvantages of the connected-component method. Fourth, it presents the overall design and analyzes its feasibility; this is the main research part and covers the core algorithms and functions. Fifth, it describes the implementation of each function: image preprocessing, image dilation, text-line extraction, image reconstruction, and border processing, of which the last three form the core of the work.
Image preprocessing prepares the input image for the system. Dilation blurs the boundaries between words. Text-line extraction obtains sample points along each text-line curve, using the characteristics of the dilated regions and an improved template search method. An approximating curve is then computed by least-squares fitting.
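The curve-fitting step can be illustrated with a minimal sketch. It assumes the sample points of one text line are already available as (x, y) pixel coordinates and that a low-degree polynomial is an adequate model; the abstract does not state which curve family the thesis fits, so the degree and the names `fit_polynomial` and `evaluate` are hypothetical.

```python
def fit_polynomial(xs, ys, degree=2):
    """Least-squares polynomial fit (illustrative, not the thesis's code).

    Returns coefficients c such that y ~ c[0] + c[1]*x + ... + c[degree]*x**degree.
    Assumes at least degree + 1 distinct x values; otherwise the system is singular.
    """
    n = degree + 1
    # Normal equations (A^T A) c = A^T y, where A is the Vandermonde matrix of xs.
    ata = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    aty = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]
    aug = [row + [b] for row, b in zip(ata, aty)]

    # Gaussian elimination with partial pivoting on the augmented matrix.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]

    # Back substitution.
    coeffs = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (aug[i][n] - tail) / aug[i][i]
    return coeffs


def evaluate(coeffs, x):
    """Evaluate the fitted polynomial at x."""
    return sum(c * x ** i for i, c in enumerate(coeffs))
```

For real page widths (thousands of pixels), the normal equations can become poorly conditioned, so rescaling x to a small range before fitting is advisable.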
The image reconstruction function rebuilds the distorted image according to a set of reconstruction rules. Statistical analysis of a large number of distorted text lines shows that the text-line curves on the left and right pages bend in opposite directions. The boundary processing module extracts the boundary characters of each line and makes them consistent in size with the body text; the result is then stitched together with the edge image to produce the final corrected image. Finally, based on the implementation and the testing standards, and using the HW-OCR recognition rate as the evaluation criterion, the thesis compares the images before and after correction. It also compares the strengths and weaknesses of the connected-component labeling method and the single-text-line correction method. The experimental results show that the proposed design has good practical value.
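As a rough illustration of what reconstruction from a fitted curve can look like, the sketch below shifts each pixel column vertically so that a curved baseline lands on one straight row. This is a simplified stand-in that uses a single curve for the whole image; the thesis's actual reconstruction rules, which draw on multiple text lines, are not given in the abstract. The function name `straighten` and its parameters are hypothetical.

```python
def straighten(image, baseline, target_row, background=0):
    """Shift each column so the curved baseline maps onto target_row.

    image      : list of rows, each a list of pixel values (all rows equal length)
    baseline   : callable x -> y giving the fitted text-line curve (assumed input)
    target_row : row index the baseline should occupy after correction
    Pixels shifted outside the image are dropped; uncovered pixels get `background`.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    out = [[background] * width for _ in range(height)]
    for x in range(width):
        # Vertical offset needed to bring this column's baseline to target_row.
        shift = target_row - int(round(baseline(x)))
        for y in range(height):
            ny = y + shift
            if 0 <= ny < height:
                out[ny][x] = image[y][x]
    return out
```

Because the curves on left and right pages bend in opposite directions, a separate baseline would be fitted and applied for each page in such a scheme.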
Keywords/Search Tags:text image, distortion correction, text line, image refactoring, joint