Handwritten Character Recognition Method Based On Natural Strokes Segmentation

Posted on: 2015-05-28 | Degree: Master | Type: Thesis
Country: China | Candidate: J Huang
GTID: 2308330479989761 | Subject: Computer Science and Technology
Abstract/Summary:
In current handwriting recognition systems, feature extraction for input handwritten text is mostly tailored to the characteristics of a particular language and implemented with statistical methods. However, accurate recognition of Chinese characters is difficult: the character set contains a very large number of classes, many of which look alike. Furthermore, the structural complexity of the characters and differences in writing style cause large deformations in the handwritten characters of individual writers. The mainstream approach to Chinese character feature extraction combines base units with character features derived from statistical data. The base units are structural features specific to Chinese characters, such as the horizontal line, the top-down vertical line, the left-downward slope line and the short pausing stroke. Although the base unit classes can be well defined, it is still difficult to extract the structure of handwritten Chinese characters because of various factors. Extracting a proper structure is vital to a feature-based character recognition system, which motivates research on how to extract general and stable structural features from handwritten text.

The main work of this research is a recognition method based on natural stroke segmentation that takes into account the diversity of handwriting styles and the writing habits of different users. The proposed method has three phases: splitting the natural strokes of the collected handwritten text, classifying the split units, and training the classifier. In the first phase, split units are obtained using three language-independent break rules, based respectively on the slope computed from point coordinates, on curvature, and on a combination of the two. Next, the split units are assigned to base unit classes, which are either defined manually or obtained with a clustering algorithm. Finally, character recognition is performed with a convolutional neural network (CNN). In addition, because the feature matrix obtained from the split units is too sparse, fuzzy features are added to each character to reconstruct its feature matrix before classification.

The training data used in this thesis comprise SCUT-COUCH2009, collected by South China University of Technology; HIT-OR3C, from the Shenzhen Graduate School of Harbin Institute of Technology; and CASIA-OLHWDB1. The experimental data set consists of 20 sets of documents chosen from HIT-OR3C. The experiments show that using the high-level feature representation of handwritten text directly for classification and recognition does not achieve the desired results. However, applying the CNN directly to the original features of the text before and after segmentation shows that the proposed model based on natural stroke segmentation generalizes better than the model without it: it reduces the character error rate by 27.38% and achieves a correct-choice rate of 95.28%. More work is still needed on better split rules in order to reach the expected results.
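The abstract does not spell out the break rules, so the following is only a minimal sketch of what a slope-based rule could look like, not the method used in the thesis. It assumes a stroke is a list of `(x, y)` pen coordinates and breaks it wherever the writing direction turns by more than a threshold. The names `turning_angle` and `split_stroke` and the 60-degree default are hypothetical choices made for illustration.

```python
import math


def turning_angle(p_prev, p, p_next):
    """Absolute change in direction (radians) at p along p_prev -> p -> p_next."""
    a1 = math.atan2(p[1] - p_prev[1], p[0] - p_prev[0])
    a2 = math.atan2(p_next[1] - p[1], p_next[0] - p[0])
    diff = abs(a2 - a1)
    # Wrap around so the result always lies in [0, pi].
    return min(diff, 2 * math.pi - diff)


def split_stroke(points, angle_threshold=math.radians(60)):
    """Split one pen stroke into units at sharp direction changes.

    The threshold is an assumed value, not one taken from the thesis.
    """
    if len(points) < 3:
        return [list(points)]
    units = []
    current = [points[0]]
    for i in range(1, len(points) - 1):
        current.append(points[i])
        if turning_angle(points[i - 1], points[i], points[i + 1]) > angle_threshold:
            units.append(current)
            current = [points[i]]  # the break point also starts the next unit
    current.append(points[-1])
    units.append(current)
    return units
```

For an L-shaped stroke such as `[(0, 0), (1, 0), (2, 0), (2, 1), (2, 2)]`, this sketch yields two units that share the corner point `(2, 0)`: a horizontal segment and a vertical segment. A curvature-based rule could follow the same pattern with a different per-point measure.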
Keywords/Search Tags: handwriting recognition, natural stroke, fuzzy feature, convolutional neural network