
Low Quality Printed Character Segmentation And Recognition

Posted on: 2015-03-09
Degree: Master
Type: Thesis
Country: China
Candidate: Q Sun
Full Text: PDF
GTID: 2268330425987762
Subject: Pattern Recognition and Intelligent Systems
Abstract/Summary:
Machine-printed character recognition has become increasingly mature, and its applications continue to multiply. However, recognizing low-quality printed characters remains a bottleneck for OCR. This thesis analyzes the problems that arise when recognizing low-quality characters and designs a printed character recognition process that accounts for both those problems and the features of the experimental subjects. The process is validated on license plate images and on images of the crown word number (banknote serial number) area.

The global threshold computed by the Otsu algorithm is used to determine a set of candidate thresholds. The best threshold is chosen from the candidates by an evaluation score, computed with an evaluation criterion based on connected-domain analysis of continuous adjacent characters. This threshold selection method is applied in the character segmentation stage, and the segmentation results validate it.

A single segmentation strategy has limited effect on low-quality printed characters, so a two-level character segmentation strategy is used. The first level applies connected-domain analysis of continuous adjacent characters. If the evaluation score calculated by this analysis is greater than a predefined threshold, a second level of segmentation based on projection analysis of the character image is applied. The segmentation results indicate that the two-level strategy outperforms a single segmentation strategy.

Low-quality printed character images contain many interference factors, so similar characters are recognized less accurately than ordinary characters. To reduce these errors, a two-stage recognition method based on the sensitive areas of similar characters is introduced, with confidence coefficients calculated by a genetic algorithm. Experiments demonstrate that this method reduces the confusion errors caused by similar characters.

With the method introduced in this thesis, recognition accuracy on 2021 car license plates reaches 82.1%, an improvement of 3.3% over the single strategy. Accuracy on 5113 crown word number images reaches 93.7%, an improvement of 1.5% over the single recognition strategy.
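The threshold selection and two-level segmentation described above can be illustrated with a minimal Python sketch. It is not the thesis's implementation. The abstract does not state the evaluation criterion, so the score used here (the absolute difference between the number of connected components and an expected character count) is a hypothetical stand-in. The candidate offsets, the `score_limit` parameter, all function names, and the assumption of bright characters on a dark background are likewise illustrative.

```python
from collections import deque

def otsu_threshold(img):
    """Otsu threshold for a grayscale image given as rows of ints in 0..255."""
    hist = [0] * 256
    for row in img:
        for p in row:
            hist[p] += 1
    total = sum(hist)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var, w0, sum0 = 0, -1.0, 0, 0
    for t in range(256):
        w0 += hist[t]
        if w0 == 0:
            continue
        w1 = total - w0
        if w1 == 0:
            break
        sum0 += t * hist[t]
        # Between-class variance (up to a constant factor).
        var = w0 * w1 * (sum0 / w0 - (sum_all - sum0) / w1) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t

def binarize(img, t):
    """Assumption: characters are brighter than the background."""
    return [[1 if p > t else 0 for p in row] for row in img]

def component_spans(binary):
    """Column span (start, end) of each 4-connected foreground component."""
    h, w = len(binary), len(binary[0])
    seen = [[False] * w for _ in range(h)]
    spans = []
    for y in range(h):
        for x in range(w):
            if binary[y][x] and not seen[y][x]:
                seen[y][x] = True
                queue, lo, hi = deque([(y, x)]), x, x
                while queue:
                    cy, cx = queue.popleft()
                    lo, hi = min(lo, cx), max(hi, cx)
                    for ny, nx in ((cy + 1, cx), (cy - 1, cx), (cy, cx + 1), (cy, cx - 1)):
                        if 0 <= ny < h and 0 <= nx < w and binary[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
                spans.append((lo, hi + 1))
    return sorted(spans)

def choose_threshold(img, expected_chars, offsets=(-20, -10, 0, 10, 20)):
    """Try candidates around the Otsu threshold; keep the lowest (best) score.
    The score is a hypothetical criterion, not the one used in the thesis."""
    base = otsu_threshold(img)
    best = None
    for off in offsets:
        t = min(255, max(0, base + off))
        score = abs(len(component_spans(binarize(img, t))) - expected_chars)
        if best is None or score < best[1]:
            best = (t, score)
    return best

def projection_split(binary):
    """Second level: split at empty columns of the vertical projection."""
    w = len(binary[0])
    proj = [sum(row[x] for row in binary) for x in range(w)]
    spans, start = [], None
    for x, v in enumerate(proj):
        if v and start is None:
            start = x
        elif not v and start is not None:
            spans.append((start, x))
            start = None
    if start is not None:
        spans.append((start, w))
    return spans

def segment(img, expected_chars, score_limit=0):
    """Two-level segmentation: fall back to projection when the score is too high."""
    t, score = choose_threshold(img, expected_chars)
    binary = binarize(img, t)
    if score > score_limit:
        return "projection", projection_split(binary)
    return "components", component_spans(binary)

# Two clean characters: the first level suffices.
clean = [[0, 200, 200, 0, 0, 200, 0],
         [0, 200, 0, 0, 0, 200, 0],
         [0, 200, 200, 0, 0, 200, 0]]
# The first character has a broken stroke, so it yields two components.
broken = [[0, 200, 0, 0, 200, 0],
          [0, 0, 0, 0, 200, 0],
          [0, 200, 0, 0, 200, 0]]
print(segment(clean, 2))
print(segment(broken, 2))
```

In the second toy image the broken stroke produces three components for two expected characters, so the sketch falls back to projection, which merges the fragments because they share a column. Real plates and serial-number images would need noise handling that this sketch omits.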
Keywords/Search Tags: Machine printed, Low quality, Multiple threshold selection, Two-level segmentation, Similar characters, Genetic algorithm, Two-stage recognition