
The Research Of Off-Line Hand-Written Chinese Character Recognition Based On Sub-Stroke

Posted on: 2009-12-16
Degree: Master
Type: Thesis
Country: China
Candidate: T F Jin
GTID: 2178360245994635
Subject: Computer application technology
Abstract/Summary:
Off-line handwritten Chinese character recognition has good application prospects and high theoretical value. From the application standpoint, many paper documents need to be entered into computers at high speed, especially in library digitization and automatic envelope handling. Other examples include ID checking and the automatic reading of bills and notes. From the theoretical standpoint, traditional pattern recognition theory and techniques do not perform well on Chinese character recognition, so research on off-line handwritten Chinese character recognition can advance pattern recognition theory. The field draws on many subjects, such as pattern recognition, image processing, digital signal processing, natural language understanding, AI, fuzzy mathematics, information theory, and Chinese information processing. For this reason, it can also promote research in related subjects and the fusion of different subjects, which makes the field worth studying.

More than forty years have passed since Casey and Nagy, two IBM employees, wrote the first paper on printed Chinese character recognition. Thanks to the hard work of many researchers, on-line handwritten character recognition and off-line printed character recognition have entered daily life through many practical products. Only off-line handwritten Chinese character recognition still fails to achieve good results. It is regarded as the most difficult field in OCR, and structure-based recognition in particular remains a challenge to researchers. Off-line recognition cannot use information such as stroke order and writing pressure, and handwritten characters vary both between writers and between handwriting styles. Of all the difficulties, false connection and distortion are the hardest.
How to solve these problems has therefore become the focus of research.

In this thesis, the author mainly studied off-line handwritten Chinese character recognition based on stroke recognition, which involves pattern recognition and image processing techniques. Preprocessing plays an important role in the recognition process, and the author mainly studied thinning and feature point extraction. Thinning methods can be divided into two types. The first type obtains a single-sided edge in one scan. The second type obtains the center line (the frame, i.e. the skeleton) through multiple scans. At present, among methods of the second type, template-based thinning gives good results, although it has flaws such as too many templates, high memory demand, and slow processing. To remedy these defects, some researchers have proposed thinning methods based on groups. Building on this earlier research, the author proposed a grouping thinning method. With this method, thinning finishes quickly, and the feature points other than inflection points are marked at the same time. In detail, the method scans the character image pixel by pixel, judges each pixel's type from the number of groups among its eight neighboring pixels, and assigns types to the pixels of the stroke image layer by layer (deleting a foreground pixel only sets a deleted mark on its type; the pixel is not treated as background until the next scan). This is repeated until all foreground pixels have been marked. During thinning, the author marked, based on the group number, the isolated pixels (0 branches), end pixels (1 branch), frame pixels (2 branches), and cross pixels (3-8 branches) of the center line in the character image.

Inflection-point extraction methods can be divided into two types. The first type extracts inflection points from local features, that is, from image pixels. This approach requires a large amount of computation and suffers heavily from noise.
The second type extracts inflection points from global features, that is, from the shape of the stroke as a whole, ignoring small curvature in the frame. This approach is simple and fast and is little affected by noise, so it is well suited to finding the inflection points of Chinese character frames; the farthest distance method is one example. However, the farthest distance method can only handle strokes with a single inflection point: it cannot handle strokes with multiple inflection points, and it cannot determine how many inflection points a stroke has. The author improved the farthest distance method and proposed the far inflection-point method. This method keeps the fast processing speed and low sensitivity to noise, while also determining the number of inflection points in a stroke and finding all inflection points of a stroke with multiple inflection points.

Experiments show that the grouping thinning method and the far pixel inflection-point method are fast, need little memory, and are algorithmically simple and efficient. Finally, the author presented the system flow for off-line handwritten Chinese character recognition; some modules still have flaws that remain to be solved. The author also presented some algorithms and their execution results. The programs for the thesis were developed in the Visual C++ 6.0 environment.
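The branch-based marking of center-line pixels described in the abstract can be illustrated with a small Python sketch. It is not the thesis's implementation (which was written in Visual C++ 6.0): it assumes that a "group" is a run of consecutive foreground pixels met while walking once around the eight neighbors, and the names `count_groups` and `classify_pixel` are hypothetical. Under this assumption at most four groups can occur, whereas the abstract lists 3-8 branches for cross pixels, so the author's exact counting rule may differ.

```python
# Offsets of the eight neighbours, clockwise, starting directly above.
NEIGHBOURS = [(-1, 0), (-1, 1), (0, 1), (1, 1),
              (1, 0), (1, -1), (0, -1), (-1, -1)]


def count_groups(img, r, c):
    """Count runs of consecutive foreground pixels around (r, c).

    Assumption: this run count stands in for the abstract's "group number".
    Pixels outside the image are treated as background.
    """
    rows, cols = len(img), len(img[0])
    ring = []
    for dr, dc in NEIGHBOURS:
        rr, cc = r + dr, c + dc
        inside = 0 <= rr < rows and 0 <= cc < cols
        ring.append(1 if inside and img[rr][cc] else 0)
    # Each background-to-foreground step (circularly) starts a new group.
    transitions = sum(1 for i in range(8) if ring[i - 1] == 0 and ring[i] == 1)
    if transitions == 0 and all(ring):
        return 1  # all eight neighbours set: one group that wraps around
    return transitions


def classify_pixel(img, r, c):
    """Label a pixel of an already thinned image by its branch count."""
    if not img[r][c]:
        return "background"
    groups = count_groups(img, r, c)
    if groups == 0:
        return "isolated"
    if groups == 1:
        return "end"
    if groups == 2:
        return "frame"
    return "cross"


# A 7x7 plus sign: one cross pixel in the middle and four end pixels.
plus = [[1 if r == 3 or c == 3 else 0 for c in range(7)] for r in range(7)]
labels = {p: classify_pixel(plus, *p) for p in [(3, 3), (2, 3), (0, 3), (0, 0)]}
```

Counting runs instead of raw neighbors keeps the pixels next to a crossing classified as ordinary frame pixels, because their diagonal and straight neighbors on the same arm merge into one group.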
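The farthest distance idea can also be sketched in Python. The abstract does not describe how the far inflection-point method extends it to several inflection points, so the recursive split below is a stand-in: it is essentially the well-known Ramer-Douglas-Peucker subdivision, not the author's algorithm. The function names and the `threshold` of 2.0 pixels are hypothetical, and a stroke is assumed to be an ordered list of `(x, y)` frame pixels running from one end pixel to the other.

```python
import math


def chord_distance(p, a, b):
    """Perpendicular distance from point p to the line through a and b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    length = math.hypot(dx, dy)
    if length == 0:
        return math.hypot(px - ax, py - ay)  # degenerate chord
    return abs(dx * (py - ay) - dy * (px - ax)) / length


def farthest_point(path, lo, hi):
    """Index and distance of the point farthest from the chord lo..hi."""
    best_i, best_d = lo, 0.0
    for i in range(lo + 1, hi):
        d = chord_distance(path[i], path[lo], path[hi])
        if d > best_d:
            best_i, best_d = i, d
    return best_i, best_d


def find_inflection_points(path, threshold=2.0):
    """Return inflection points along the stroke, in path order.

    Assumption: a point counts as an inflection point when it lies more
    than `threshold` pixels from the chord of its segment; each segment
    is then split at that point and searched again.
    """
    def split(lo, hi):
        if hi - lo < 2:
            return []
        i, d = farthest_point(path, lo, hi)
        if d <= threshold:
            return []
        return split(lo, i) + [i] + split(i, hi)

    return [path[i] for i in split(0, len(path) - 1)]


# An L-shaped stroke (one corner) and a Z-like stroke (two corners).
l_stroke = [(0, y) for y in range(6)] + [(x, 5) for x in range(1, 6)]
z_stroke = ([(x, 0) for x in range(6)] + [(5, y) for y in range(1, 6)]
            + [(x, 5) for x in range(6, 11)])
l_points = find_inflection_points(l_stroke)
z_points = find_inflection_points(z_stroke)
```

The single-pass version (one call to `farthest_point` on the whole stroke) corresponds to the plain farthest distance method and would report only one corner of the Z-like stroke; the recursion is what lets the number of inflection points fall out of the search.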
Keywords/Search Tags: Off-line Handwritten Chinese Character Recognition, Thinning, Feature Point Extraction, Stroke Extraction