Research On Limited-set Defaced Chinese Character Recognition

Posted on: 2009-09-25 | Degree: Master | Type: Thesis
Country: China | Candidate: W Y Sun
GTID: 2178360272977111 | Subject: Communication and Information System
Abstract/Summary:
Recognition of defaced Chinese characters from a limited set is an important topic within Chinese character recognition, with direct applications in license plate recognition systems and ID card character recognition. Recognition of intact printed characters already achieves very good results, but little research has addressed defaced characters in a limited set, so the problem has considerable practical significance and room for further research. The system described here targets 100 Chinese characters that were defaced to some degree during scanning. It is divided into the following steps:

1. Preprocessing of the character image. The characters selected for the experiments are relatively clear, with only a little noise. The image is therefore smoothed with a linear filter and binarized with a global thresholding algorithm, after which linear normalization is applied according to the characteristics of the character image.

2. Thinning. This thesis presents an improved thinning algorithm based on mathematical morphology. It introduces thinning rules that apply at T-shaped stroke junctions; these rules solve the concave distortion that the traditional morphological thinning algorithm produces at such junctions. It also presents a tracking algorithm for three-way bifurcation burrs at endpoints, which successfully removes the spurious branches left by traditional algorithms. Experimental results demonstrate the effectiveness of the algorithm.

3. Feature extraction. Commonly used feature extraction algorithms are first introduced briefly. Then, working from the thinned character, the number of strokes and the interrelationships among the strokes are extracted as the features of the character.

4. Recognition. This thesis proposes an improved two-tier serial classification structure that divides the limited-set characters into three groups: left-right, up-down, and independent. To narrow the search to a subset of candidate characters, left-right and up-down characters first undergo a rough classification according to standard components. In the fine classification stage, the characters or sub-characters are matched against the character library, which is organized according to the same three cases, using a matching algorithm based on the character's eigenvector matrix.

We selected 100 samples, all of them defaced to some degree. The recognition rate is 94%.
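
The preprocessing step (linear filtering, global binarization, linear normalization) can be illustrated with a minimal Python sketch. The abstract does not name the specific filter, threshold rule, or target size, so the 3x3 mean filter, the iterative global threshold, and the nearest-neighbour scaling below are assumptions chosen for illustration, and all function names are hypothetical rather than taken from the thesis.

```python
def mean_filter(img):
    """Smooth a grayscale image (list of rows) with a 3x3 mean filter.
    Assumed stand-in for the 'linear filter'; borders use available neighbours."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            vals = [img[j][i]
                    for j in range(max(0, y - 1), min(h, y + 2))
                    for i in range(max(0, x - 1), min(w, x + 2))]
            out[y][x] = sum(vals) / len(vals)
    return out


def global_threshold(img, max_iter=100):
    """Pick one threshold for the whole image by iterating between the
    mean of the dark pixels and the mean of the bright pixels (assumed rule)."""
    pixels = [p for row in img for p in row]
    t = sum(pixels) / len(pixels)
    for _ in range(max_iter):
        low = [p for p in pixels if p <= t]
        high = [p for p in pixels if p > t]
        if not low or not high:
            break
        new_t = (sum(low) / len(low) + sum(high) / len(high)) / 2
        if abs(new_t - t) < 0.5:
            return new_t
        t = new_t
    return t


def binarize(img, t):
    """Map dark pixels (ink) to 1 and bright pixels (background) to 0."""
    return [[1 if p <= t else 0 for p in row] for row in img]


def normalize(binary, size):
    """Linear normalization: crop to the bounding box of the strokes,
    then scale to size x size with nearest-neighbour sampling."""
    ys = [y for y, row in enumerate(binary) if any(row)]
    xs = [x for x in range(len(binary[0])) if any(row[x] for row in binary)]
    if not ys:  # blank image: nothing to crop
        return [[0] * size for _ in range(size)]
    top, left = ys[0], xs[0]
    bh, bw = ys[-1] - top + 1, xs[-1] - left + 1
    return [[binary[top + (j * bh) // size][left + (i * bw) // size]
             for i in range(size)]
            for j in range(size)]


def preprocess(img, size=32):
    """Full chain: smooth, binarize globally, normalize."""
    smoothed = mean_filter(img)
    return normalize(binarize(smoothed, global_threshold(smoothed)), size)
```

Calling `preprocess(gray_image)` on a grayscale image stored as a list of rows returns a fixed-size binary grid that a thinning stage could consume. A single global threshold is reasonable here only because the source describes the samples as relatively clear with little noise; unevenly lit scans would likely need a local method.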
Keywords/Search Tags: Chinese character recognition, limited-set, sub-stroke, stroke-extraction eigenvector matrix, chain list matching