Research On Technology For On-line Handwritten Chinese Character Recognition

Posted on: 2010-10-12
Degree: Doctor
Type: Dissertation
Country: China
Candidate: X Q Lv
GTID: 1118360302971166
Subject: Computer applications
Abstract/Summary:
As a user-friendly form of human-computer interaction, on-line handwriting has always been popular because it requires no learning or memorization. In recent years, the spread of consumer electronic devices has placed even higher performance demands on on-line handwritten Chinese character input. As the means of solving this input problem, on-line handwritten Chinese character recognition has long been an active research topic. It is, however, a complex "super-multi-class" problem in pattern recognition, and one that an on-line handwriting input system cannot avoid: Chinese characters fall into a very large number of categories, have complex structures, and vary in shape as they are written. Over the years, many new methods and technologies have been applied to on-line handwritten Chinese character recognition systems, with some success. However, the key algorithms and architectures have not yet been reported in full. Research on this technology therefore has broad market prospects and considerable theoretical significance.

This thesis studies the problems of on-line handwritten Chinese character recognition. The work falls into four main parts: a segment extraction algorithm based on interior angles and polygonal approximation; segment combination based on a finite state machine; a radical-based on-line handwritten Chinese character recognition system; and the use of support vector machines to identify radicals and similar characters.

Extracting the structure of a Chinese character is unstable, and recognition results suffer when strokes are merged too much or too little.
To solve these problems, this thesis presents a segment extraction and combination algorithm for on-line Chinese character recognition, based on polygonal approximation and finite state machines. In this method, the point with the smallest interior angle is located; if that angle is less than a given threshold, the stroke is split at this point, called a cut-off point or inflection point, into two adjacent curves. The same step is then applied to each of the two curves to detect further cut-off points, and the process repeats until the smallest interior angle in every curve is larger than the threshold. The cut-off points, together with the start and end points, make up the stroke, and each pair of adjacent points forms a segment. After the segments have been extracted, a finite state machine checks whether adjacent segments should be combined, so that redundant segments are removed. Experiments show that this method has lower computational complexity and approximates strokes better than other methods.

This thesis also studies radical-based algorithms for on-line Chinese character recognition. In our system, Chinese characters are divided into five types: surrounded, semi-surrounded, left-right, top-bottom, and single. Once the type of a character is determined, the character is split repeatedly until every part is of the single type. The character is then described by a radical attribute string, from which the match result is obtained. The system is both stable and efficient, and achieves very good recognition results.

An exclusion method is used to determine the type of a character. First, the character is tested for the surrounded type: if a "口" shape can be detected on the periphery of the character, it is of the surrounded type, and depending on its number of strokes it is either split into two parts or recognized directly. Otherwise, the character is tested for the semi-surrounded type: if the long strokes detected in the character meet certain structural features, it is a semi-surrounded character; the type of its radical is detected, and the character is again either split or recognized depending on its number of strokes. When the character is neither surrounded nor semi-surrounded, a clustering algorithm is applied in the next step. To ensure accurate classification, the result is verified before the type is finally determined. Whether the character is of the left-right or the top-bottom type, it is split until it cannot be split any further, so that each extracted part is a radical. To ensure accurate radical recognition, a local sorting algorithm is used to sort the segments within each radical. Finally, the character is represented as a cluster of radicals, on the basis of which the match result is obtained.

The support vector machine (SVM) is a statistical learning method with global optimality and good generalization ability, and in recent years it has been widely used in pattern recognition. This thesis studies and discusses the application of SVMs to on-line handwritten Chinese character recognition. First, statistical features are extracted from the extracted segments; an SVM is then used to learn these features and to recognize the radicals. Experiments show that SVMs can identify radicals effectively to some extent, and illustrate the advantages of the method.
Because Chinese characters have complex structures and many of them resemble one another, we use SVMs to capture the partial spatial features of similar characters and to recognize them. First, we analyze and identify the features that distinguish similar characters; the SVM is then trained on these features and used to tell the similar characters apart. Experiments show that the overall recognition rate for Chinese characters increases when an SVM is used to distinguish similar characters.

This thesis presents some useful exploration in the field of on-line handwritten Chinese character recognition. Future work should focus on improving recognition speed so that the method can be used in embedded systems.
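
The recursive segment-extraction step described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis implementation: the abstract does not say how the interior angle is measured, so the sketch assumes it is the angle at a point between the rays to the two endpoints of the current curve. The function names (`interior_angle`, `cut_points`, `segments`) and the 150-degree default threshold are hypothetical choices made for illustration, and the finite-state-machine combination step is not covered.

```python
import math


def interior_angle(p, a, b):
    """Angle in degrees at p between the rays p->a and p->b.

    Assumption: this is how the "interior angle" is measured.
    Returns None when p coincides with a or b (angle undefined).
    """
    v1 = (a[0] - p[0], a[1] - p[1])
    v2 = (b[0] - p[0], b[1] - p[1])
    n1 = math.hypot(v1[0], v1[1])
    n2 = math.hypot(v2[0], v2[1])
    if n1 == 0 or n2 == 0:
        return None
    cos = (v1[0] * v2[0] + v1[1] * v2[1]) / (n1 * n2)
    cos = max(-1.0, min(1.0, cos))  # guard against rounding error
    return math.degrees(math.acos(cos))


def cut_points(stroke, threshold_deg=150.0):
    """Indices of the start point, the cut-off points, and the end point.

    threshold_deg is a hypothetical default, not a value from the thesis.
    """
    def split(lo, hi):
        # Find the interior point with the smallest angle below the threshold.
        best_i, best_angle = None, threshold_deg
        for i in range(lo + 1, hi):
            angle = interior_angle(stroke[i], stroke[lo], stroke[hi])
            if angle is not None and angle < best_angle:
                best_i, best_angle = i, angle
        if best_i is None:
            return []  # every angle is at or above the threshold: stop
        # Split at the cut-off point and repeat on both sub-curves.
        return split(lo, best_i) + [best_i] + split(best_i, hi)

    if len(stroke) < 2:
        return list(range(len(stroke)))
    return [0] + split(0, len(stroke) - 1) + [len(stroke) - 1]


def segments(stroke, threshold_deg=150.0):
    """Each pair of adjacent selected points forms one segment."""
    idx = cut_points(stroke, threshold_deg)
    return [(stroke[i], stroke[j]) for i, j in zip(idx, idx[1:])]


# An L-shaped stroke: right along the x-axis, then up along x = 3.
stroke = [(0, 0), (1, 0), (2, 0), (3, 0), (3, 1), (3, 2), (3, 3)]
print(cut_points(stroke))  # [0, 3, 6]
print(segments(stroke))    # [((0, 0), (3, 0)), ((3, 0), (3, 3))]
```

For the L-shaped example, the corner at index 3 has the smallest angle (90 degrees), so the stroke is split there; both resulting sub-curves are straight, so the recursion stops and two segments remain. Real pen input would likely need smoothing or resampling first, which this sketch omits.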
Keywords/Search Tags: Chinese character recognition, on-line handwriting, structure feature, polygon approximation, support vector machine