Study And Realization On Printed Chinese Character Recognition System

Posted on: 2012-11-30
Degree: Master
Type: Thesis
Country: China
Candidate: J N Liu
GTID: 2178330335454707
Subject: Communication and Information System
Abstract/Summary:
Printed Chinese character recognition is mainly used to enter Chinese text into a computer automatically and to support information exchange between people and computers. Chinese differs from Western languages in several respects. Chinese script is ideographic, whereas Western languages are written with a small, fixed set of simple characters, which makes Western character recognition less complex. Chinese has a very large character set with complicated structures, which is a key obstacle to recognition. Nevertheless, research into Chinese character recognition remains important for informatization.

We adopt as the dictionary the 3755 first-level (frequently used) characters of the national standard GB2312-80, and explain in detail the three key parts of Chinese character recognition: preprocessing, feature extraction and pattern matching. We studied all three parts and made certain improvements; compared with the original methods, ours appear to be much more efficient. The paper is organized as follows:

(1) The fundamental purpose of image preprocessing is to handle problems caused by poor print quality and differences among fonts, and to facilitate feature extraction and recognition. In preprocessing, binarization removes the noise introduced by photoelectric conversion and sharpens the contrast between background and target; layout analysis separates special pixels from target pixels; and character thinning concentrates the features. This paper focuses mainly on character thinning, for which we introduce a thinning algorithm based on the hit-or-miss transform of mathematical morphology. Experimental results show that the improved method retains more key features of Chinese characters and preserves connectivity better.

(2) In the feature extraction section, the paper analyzes the different types of commonly used Chinese character features, for example simple or traditional Chinese character features, connectivity and closed-area features, and peripheral and grid features. Based on a full study of the advantages and disadvantages of these algorithms, we improve the simple or traditional Chinese feature extraction and propose a new feature based on stroke crossings and energy density values. This improvement remarkably enhances the efficiency of our system.

(3) Because of the inherent limitations of any single Chinese character classifier, its classification results can hardly reach the desired level. Combining several classifiers, so as to exploit the strengths of each, is therefore necessary. This paper proposes a maximum optimizing combination algorithm that takes both recognition rate and speed into account: it integrates rate and speed through a cost function and finds the optimal trade-off point, so that the performance of the whole system can be boosted.
Keywords/Search Tags: Printed Chinese character recognition, pre-processing, feature extraction, multiple classifiers combination
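
The thinning step in item (1) is described only as being based on the hit-or-miss transform; the abstract gives neither the structuring elements nor the details of the improvement. As a rough illustration of the underlying idea, and not of the thesis's improved algorithm, the following is a minimal sketch of generic morphological thinning in Python with NumPy. The names `EDGE`, `CORNER`, `ELEMENTS`, `hit_or_miss` and `thin` are hypothetical, and the two 3x3 templates are one commonly used textbook choice, assumed here for the example.

```python
import numpy as np

# Template values: 1 = must be foreground, 0 = must be background,
# -1 = don't care. These two templates and their 90-degree rotations are
# an assumed, commonly used set; the thesis may use different ones.
EDGE = np.array([[0, 0, 0],
                 [-1, 1, -1],
                 [1, 1, 1]])
CORNER = np.array([[-1, 0, 0],
                   [1, 1, 0],
                   [-1, 1, -1]])
ELEMENTS = [np.rot90(b, k) for k in range(4) for b in (EDGE, CORNER)]


def hit_or_miss(img, elem):
    """Return a boolean mask of pixels whose 3x3 neighbourhood matches elem."""
    h, w = img.shape
    padded = np.pad(img, 1, constant_values=False)  # outside the image = background
    match = np.ones((h, w), dtype=bool)
    for dr in range(3):
        for dc in range(3):
            want = elem[dr, dc]
            if want == -1:
                continue  # this position is unconstrained
            window = padded[dr:dr + h, dc:dc + w]
            match &= (window == bool(want))
    return match


def thin(img):
    """Remove pixels matched by each template in turn until nothing changes."""
    out = np.asarray(img, dtype=bool).copy()
    while True:
        before = out.copy()
        for elem in ELEMENTS:
            out &= ~hit_or_miss(out, elem)
        if np.array_equal(out, before):
            return out
```

Calling `thin` on a binarized character image (foreground pixels `True`) returns a thinner stroke pattern of the same shape. Whether the result keeps the features a recognizer needs depends on the templates and on the order in which they are applied, which is presumably where an improved method such as the one in the thesis would differ.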