A Study Of Key Techniques For Uighur Handwriting Recognition

Posted on: 2015-06-15  Degree: Doctor  Type: Dissertation
Country: China  Candidate: Y M Xu  Full Text: PDF
GTID: 1228330431962424  Subject: Communication and Information System
Abstract/Summary:
Compared with Latin and Chinese scripts, inherently cursive scripts such as Arabic and Uighur have received little recognition research. This dissertation presents several key techniques for offline and online Uighur handwriting recognition, focusing on character recognition, character segmentation and word recognition.

For the 128 Uighur characters, a handwriting recognition algorithm based on radical decomposition and fusion is proposed. First, a library of handwritten Uighur radicals and a radical dictionary of characters are built by decomposing each character into three types of radicals: main, affix and dot. A robust radical description is obtained by analyzing connected strokes. Second, feature extraction and classification methods are designed separately for each type of radical, and the radicals are matched to detect and identify the slight differences between similar radicals. To reduce interference from topological deformation in handwriting, a statistical feature called the time-division directional feature is introduced for online radicals. Finally, the character recognition result is obtained by fusing the outputs of multiple radical classifiers.

A multi-radical self-adaptive fusion scheme is proposed for the character recognition algorithm. To fuse the radicals adaptively, a real-time confidence weighting method uses the confidence distribution of the radicals to estimate their fusion coefficients. Several radical fusion strategies are developed, based respectively on the weighted sum method, a weighted naive Bayesian model and improved D-S evidence theory. Comparative experiments confirm that the fusion strategy based on improved D-S evidence theory overcomes both the weakness of weighted sum fusion in identification and the over-sensitivity of Bayesian fusion to noise.
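
As a rough illustration of how weighted sum fusion and evidence-theoretic fusion differ, the sketch below combines the outputs of two hypothetical radical classifiers. It uses the classical Dempster rule of combination, not the improved D-S variant of the dissertation, whose details are not given in this abstract. The function names, the class labels and the restriction to singleton classes plus a single ignorance term `ANY` are all assumptions made for illustration.

```python
def weighted_sum(scores, weights):
    """Fuse per-classifier score dicts by a normalized weighted sum."""
    total = sum(weights)
    classes = scores[0].keys()
    return {c: sum(w * s[c] for w, s in zip(weights, scores)) / total
            for c in classes}


def dempster_combine(m1, m2):
    """Classical Dempster rule for masses on singleton classes plus 'ANY'.

    'ANY' is a hypothetical key standing for the whole frame of
    discernment, i.e. the mass a classifier leaves uncommitted.
    """
    classes = [c for c in m1 if c != "ANY"]
    fused = {}
    for c in classes:
        # Pairs whose intersection is {c}: (c, c), (c, ANY), (ANY, c).
        fused[c] = m1[c] * m2[c] + m1[c] * m2["ANY"] + m1["ANY"] * m2[c]
    fused["ANY"] = m1["ANY"] * m2["ANY"]
    norm = sum(fused.values())  # equals 1 minus the conflicting mass
    if norm == 0:
        raise ValueError("total conflict: sources cannot be combined")
    return {k: v / norm for k, v in fused.items()}


# Hypothetical outputs of a main-radical and a dot-radical classifier.
m_main = {"A": 0.6, "B": 0.3, "ANY": 0.1}
m_dot = {"A": 0.2, "B": 0.7, "ANY": 0.1}
fused = dempster_combine(m_main, m_dot)
```

In this sketch the conflicting mass (one classifier voting for `A`, the other for `B`) is discarded and the remainder is renormalized, whereas the weighted sum simply averages the scores.
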
The D-S-based fusion thus improves both the recognition rate and the stability of the character recognition algorithm.

Character segmentation is difficult because of cursive writing and stroke drift. To address this, a new character segmentation algorithm based on the weighted fusion of multiple information sources is proposed. First, a grapheme over-segmentation algorithm based on main segmentation and additional clustering is presented, and the graphemes are fuzzy-matched to obtain robust over-segmentation primitive sequences. Second, the matching information is estimated with a Gaussian model of matching position to reduce interference from stroke drift. Finally, a Markov model of character sequences is established, and a weighted fusion formula for the word posterior probability is derived from the Bayes criterion. By integrating grapheme matching information, recognition confidence and semantic information, the character segmentation result is obtained when the optimal grapheme matching and merging path is found.

A segmentation-driven recognition system for handwritten Uighur words, based on a feedback structure and grapheme analysis, is also proposed. To avoid the error accumulation of a sequential structure, a feedback structure controls the results of character segmentation and word recognition simultaneously. Three feedback errors are estimated and responded to: grapheme shape error, character recognition error and word matching error. First, the word image is over-segmented into main and additional graphemes. Second, a feedback-based grapheme merging strategy provides the best segmented character sequence. Then, using the structural information obtained during character segmentation, a hierarchical hybrid Uighur character classifier is designed to improve character recognition accuracy.
Finally, a two-level dynamic time warping is presented to select the best character sequence hypothesis and decide the word class.
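
To make the word-matching step more concrete, here is a minimal sketch of plain, single-level dynamic time warping used to pick the closest lexicon entry for a recognized character sequence. The abstract does not describe the structure of the two-level variant, so this is not the dissertation's method. The 0/1 character cost, the function names and the Latin placeholder strings are assumptions for illustration only.

```python
def dtw_distance(a, b, cost=lambda x, y: 0.0 if x == y else 1.0):
    """Classic dynamic time warping distance between two sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    # d[i][j] holds the cheapest alignment cost of a[:i] with b[:j].
    d = [[inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
            d[i][j] = cost(a[i - 1], b[j - 1]) + step
    return d[n][m]


def best_word(hypothesis, lexicon):
    """Return the lexicon entry with the smallest DTW distance."""
    return min(lexicon, key=lambda w: dtw_distance(hypothesis, w))


# Placeholder strings stand in for recognized character sequences.
choice = best_word("abd", ["abc", "xyz"])
```

Because warping lets one element align with several, a hypothesis with a duplicated character (for example from over-segmentation) can still match its lexicon entry at zero cost under this cost function.
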
Keywords/Search Tags: Handwriting recognition, offline, online, Uighur language, character recognition, character segmentation, word recognition, radical, multiple information fusion, feedback