
The Research And Application Of Correcting Method Based On Connected Components For Warped Chinese Document Images

Posted on: 2016-12-22  Degree: Master  Type: Thesis
Country: China  Candidate: Z D Guo  Full Text: PDF
GTID: 2298330467493347  Subject: Computer Science and Technology
Abstract/Summary:
With the rapid development of information technology, digital image processing is being used in more and more fields. OCR (Optical Character Recognition), a representative application of artificial intelligence and machine vision, is used in military, transportation, medical, office automation, education, and many other fields, and converting paper books into electronic documents is becoming increasingly common. Ideally, the target images would be free of distortion; in practice, various kinds of distortion may be present, such as poor illumination, skew, perspective distortion, and warping. These distortions degrade OCR results, so they need to be corrected before the target images are recognized.
This paper addresses the problem of warped distortion in document images. Based on an analysis of warped images and character features, a summary of research results published in China and abroad in recent years, and a comparison of the advantages and disadvantages of those methods, a fast correction method based on connected components is proposed for warped Chinese document images.
The first part introduces the latest developments and current state of the warp correction field, summarizes and analyzes a number of classic approaches, and then briefly introduces the method proposed in this paper. The second part presents and analyzes commonly used techniques related to warp correction and connected-component processing, such as grayscale conversion, binarization, denoising, image cutting, and connected component search. The third part lays out the candidate method designs for this problem and analyzes the feasibility of each.
The fourth part gives the implementation of the solution proposed in this paper. Every module is introduced in detail, especially binarization, denoising, and image cutting, as well as the method for character extraction and text line location, called Characters and Text lines Locate Alternately (CTLA); the last two modules are the innovations of this solution. Every module has been optimized as far as possible to make the solution as effective as possible. The fifth part presents the experimental data and a quality evaluation of the solution. The corrected images, together with the OCR results obtained from them, show that the proposed method is effective and practical. Finally, the results indicate that the solution can correct warped distortion in Chinese document images with well-controlled running time, which gives it practical value.
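The abstract names binarization and connected component search as preprocessing steps but gives no detail on either. As a rough illustration only, the sketch below shows one common way such steps can be done: a fixed global threshold followed by a breadth-first search over 8-connected foreground pixels, returning one bounding box per component. The function names, the threshold value of 128, and the choice of 8-connectivity are assumptions made for this example and are not taken from the thesis, whose own binarization and CTLA procedures are not described here.

```python
from collections import deque


def binarize(gray, threshold=128):
    """Mark pixels darker than the threshold as foreground (1), others as 0.

    The fixed global threshold is an assumption for this sketch.
    """
    return [[1 if px < threshold else 0 for px in row] for row in gray]


def connected_components(binary):
    """Return bounding boxes (top, left, bottom, right) of 8-connected
    foreground regions, in row-major order of their first pixel."""
    height = len(binary)
    width = len(binary[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for r in range(height):
        for c in range(width):
            if binary[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first search from an unvisited foreground pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            top, left, bottom, right = r, c, r, c
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < height and 0 <= nx < width
                                and binary[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


# Tiny grayscale image: 0 is ink, 255 is paper.
gray = [
    [255, 0, 255, 255],
    [255, 0, 255, 0],
    [255, 255, 255, 0],
]
boxes = connected_components(binarize(gray))
```

In a document image, each bounding box would roughly correspond to a character or a character stroke; grouping such boxes into text lines is the kind of step a line-location method would build on.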
Keywords/Search Tags: Chinese document image, Warped image, Connected components, Character segmentation, Characters and Text lines Locate Alternately