
Study On Orientation Free Unconstrained Handwritten Chinese Word Recognition

Posted on: 2009-04-26
Degree: Doctor
Type: Dissertation
Country: China
Candidate: T Long
Full Text: PDF
GTID: 1118360245975361
Subject: Communication and Information System
Abstract/Summary:
This dissertation studies orientation-free recognition of unconstrained cursive handwritten Chinese words. Such a recognizer lets users write cursive Chinese words freely, in any direction and without constraints. Several difficult problems remain, however: tilt correction of cursive handwritten Chinese words; segmentation of cursive words whose characters may be connected or even partially overlapped; the wide variety of cursive writing styles; and the huge size of the Chinese lexicon. To address these problems, the dissertation makes the following contributions:

1. To solve tilt correction for cursive handwritten Chinese words, an orientation detection method based on gravity center balancing is proposed. Experimental results indicate that even for handwritten samples rotated to a random direction (0° to 360°), the method can still detect the writing direction and correct the rotation. With this technique, the proposed recognition method for unconstrained cursive handwritten Chinese words becomes orientation free.

2. When users write a Chinese word quickly, the characters in the word may be joined by connected strokes. When the characters are written close together, the strokes of connected characters may even partially overlap. These situations make segmentation of cursive handwritten Chinese words very difficult. To solve these problems, an over-segmentation approach based on stroke segment extraction and connected stroke breaking is proposed. With this approach, connected and even partially overlapped characters in a word can be over-segmented. This enables the later segmentation stage, which is based on recognition and lexicon information, to correctly segment and recognize cursive handwritten Chinese words whose characters are connected or partially overlapped.

3. Because word recognition is built on single character recognition, the recognition of single cursive handwritten Chinese characters is also studied in depth. An on-line recognition method based on DTW (dynamic time warping) and an off-line recognition method based on LDA (linear discriminant analysis) and MQDF (modified quadratic discriminant function) are proposed and combined. Because the two approaches complement each other well, their combination significantly improves recognition performance for cursive handwritten Chinese characters. To build multiple prototypes for the various writing styles of Chinese characters, a novel simplified gravitational clustering method is also proposed. Compared with the traditional K-means clustering method, the proposed method builds better prototypes, and it also improves the final results of MCE (minimum classification error) training.

4. A cursive handwritten Chinese word recognition method based on single character recognition and lexicon information is proposed. Even when the single character recognizer does not output the correct result for every segmented character, the whole word can still be recognized correctly as long as the correct result is in the candidate list. To cope with the huge lexicon, a hash map technique is employed, bringing the time complexity to O(1). Experimental results show that with the proposed method the word recognition rate reaches 91.67%, whereas the recognition rate for single Chinese characters segmented by hand reaches only 84.58%. The error rate is lowered from 15.42% to 5.23%, an error rate reduction of 66.9%, which indicates that the proposed word recognition method for cursive handwritten Chinese words is highly effective.

5. The traditional MQDF classifier performs excellently on handwritten Chinese characters. However, it cannot be deployed on hand-held devices because of the storage required by its huge number of parameters. To make it usable on hand-held devices such as mobile phones, so that most people can benefit from its high recognition performance, a split VQ (vector quantization) method based on subspace distribution sharing is proposed. It greatly compresses the parameters of the MQDF classifier with very little loss in recognition rate. With this technique, the storage of the MQDF classifier is reduced from 76.4MB to 2.06MB, a compression rate of 97.3%. Meanwhile, the recognition rate remains above 97%, a drop of only 0.88 percent. This makes the high-performance MQDF classifier practical to embed in hand-held devices.

Because a word carries context information between characters, word recognition performance will be superior to single character recognition once the segmentation problem is solved. In addition, free handwriting recognition lets people write more naturally and faster. For these reasons, we believe that recognition of freely handwritten Chinese words will be the future research direction in the field of Chinese handwriting recognition.
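
The lexicon-driven idea in item 4 can be illustrated with a minimal sketch. It is not the dissertation's implementation: the function name `recognize_word`, the per-segment candidate lists, the additive scores, and the tiny lexicon are all assumptions made for illustration. The sketch only shows how a word can be recovered when the correct character is in a candidate list but not ranked first, with a hash-based set giving average O(1) lexicon lookups.

```python
from itertools import product


def recognize_word(candidates, lexicon):
    """Pick one candidate per segment so that the joined string is a lexicon word.

    candidates: one list per segmented character, each holding (char, score)
                pairs from a single-character recognizer (hypothetical format).
    lexicon:    a set of words; Python sets are hash based, so each membership
                test takes O(1) time on average.
    Returns the matching word with the highest total score, or None.
    """
    best_word, best_score = None, float("-inf")
    for combo in product(*candidates):
        word = "".join(ch for ch, _ in combo)
        if word in lexicon:
            score = sum(s for _, s in combo)  # assumed additive scoring
            if score > best_score:
                best_word, best_score = word, score
    return best_word


# Illustrative data: the top-1 candidates spell "申国", which is not a word,
# but the correct characters appear lower in the candidate lists.
example_candidates = [
    [("申", 0.6), ("中", 0.3)],
    [("国", 0.7), ("围", 0.2)],
]
example_lexicon = {"中国", "申请", "周围"}
print(recognize_word(example_candidates, example_lexicon))
```

Enumerating every combination grows exponentially with the number of segments, so this form only suits short words and short candidate lists. A practical system would prune the search, and the abstract does not say how the dissertation does this.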
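
The on-line recognizer in item 3 is based on DTW. As a reminder of what that involves, here is a minimal sketch of the textbook dynamic time warping distance between two pen trajectories, followed by nearest-prototype classification. The point format, the Euclidean local cost, and the names `dtw_distance` and `classify` are assumptions. The abstract does not describe the features or constraints the dissertation actually uses.

```python
import math


def dtw_distance(a, b):
    """Classic DTW distance between two sequences of (x, y) pen points."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = cheapest alignment cost of a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.hypot(a[i - 1][0] - b[j - 1][0], a[i - 1][1] - b[j - 1][1])
            cost[i][j] = d + min(cost[i - 1][j],      # a advances, b repeats
                                 cost[i][j - 1],      # b advances, a repeats
                                 cost[i - 1][j - 1])  # both advance
    return cost[n][m]


def classify(sample, prototypes):
    """Return the label whose closest prototype has the smallest DTW distance.

    prototypes: dict mapping a label to a list of prototype trajectories
                (several per label, one for each writing style).
    """
    return min(
        prototypes,
        key=lambda label: min(dtw_distance(sample, p) for p in prototypes[label]),
    )
```

Because DTW lets one point align with several points of the other sequence, the same stroke written at a different speed still matches closely. That makes it a natural fit for on-line (pen trajectory) input.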
Keywords/Search Tags: Handwriting recognition, word recognition, orientation normalization, compact dictionary, classifier combination