Research And Application Of Key Techniques For Printed Uighur Recognition

Posted on: 2018-03-10
Degree: Master
Type: Thesis
Country: China
Candidate: X D Wang
Full Text: PDF
GTID: 2348330521450960
Subject: Engineering
Abstract/Summary:
Character recognition is an important branch of pattern recognition. The study of Uyghur recognition is significant for preserving and developing the culture of ethnic minorities. Uyghur recognition includes printed Uyghur recognition and handwritten Uyghur recognition; this paper focuses on the former. Two kinds of methods address it: whole-word-based methods and character-segmentation-based methods. The two have different requirements for training samples. The former needs a large number of training samples, while the latter requires only 128 kinds of characters as training samples. Because Uyghur characters are joined to one another within a word, segmentation is difficult, so improving the performance of the segmentation algorithm is a challenging task. This paper adopts the character segmentation approach to printed Uyghur recognition and proposes a new segmentation algorithm, which is validated by experiments. In addition, the Zernike feature is extracted to classify printed Uyghur characters, and the experiments show that this feature reflects the statistical characteristics of Uyghur characters well. Finally, the recognition algorithm is applied to a printed Uyghur document recognition and translation system. The main contents of this paper are as follows:

1. Document image preprocessing consists of three steps: binarization, noise reduction and skew correction. Image binarization is realized by combining the iterative optimal method with the Otsu method. The experimental results show that this combination reduces broken strokes. The noise in the document image is divided into two categories: edge noise and salt-and-pepper noise. This paper adopts the Projection Profile Analysis method to eliminate edge noise. After comparing four commonly used filtering algorithms, this paper adopts an improved median filter to reduce salt-and-pepper noise. Finally, the Fourier transform and the Hough transform are combined to correct skewed document images.

2. This paper proposes an improved segmentation algorithm based on morphology and integral projection. When extracting text-line images, the algorithm avoids setting a threshold, which makes it more flexible. When splitting a connected segment, the baseline area and 3/4 of the area below the baseline are set to white, and the vertical projection is then computed to find segmentation points. This addresses the problem of missed segmentation for characters such as the '(?)' character. The experimental results show that the average recognition accuracy is 95.24%.

3. This paper adopts the Zernike feature and uses a Euclidean distance classifier to recognize printed Uyghur characters. The recognition results of the Zernike feature are compared with those of the Gabor feature, the four-direction line element feature and gradient features. Experimental results show that the Zernike feature performs better in printed Uyghur character recognition and achieves the best accuracy, 70.98%, owing to its good representation of the statistical characteristics of Uyghur characters.

4. This paper develops a printed Uyghur document recognition system for the Windows operating system. The software is developed with VS2010 and OpenCV and can be used for printed Uyghur document recognition and simple Uyghur-to-Chinese translation. The experimental results show that the Uyghur document recognition system performs well.
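
The splitting step in item 2 (whiten the baseline area and 3/4 of the area below it, then look for gaps in the vertical projection) can be illustrated with a minimal Python sketch. This is not the thesis implementation. The function name `find_cut_columns`, its parameters, the choice of the row with the most ink as the baseline, and the choice of the 3/4 band directly under the baseline are all assumptions made for illustration.

```python
import numpy as np

def find_cut_columns(segment, baseline_half_height=1, below_fraction=0.75):
    """Return candidate cut columns for one connected segment.

    segment: 2-D array with 1 = ink and 0 = background (hypothetical input format).
    """
    seg = np.asarray(segment, dtype=np.uint8).copy()
    rows = seg.shape[0]

    # Assumption: the baseline is the row containing the most ink.
    baseline = int(np.argmax(seg.sum(axis=1)))
    top = max(baseline - baseline_half_height, 0)
    bottom = min(baseline + baseline_half_height + 1, rows)

    # Whiten the baseline band.
    seg[top:bottom, :] = 0

    # Whiten 3/4 of the area below the baseline band
    # (assumption: the part adjacent to the baseline).
    below_rows = int(round((rows - bottom) * below_fraction))
    seg[bottom:bottom + below_rows, :] = 0

    # Vertical projection: amount of ink left in each column.
    projection = seg.sum(axis=0)
    empty = projection == 0

    # A cut is the centre of each run of empty columns that has ink on
    # both sides; leading and trailing margins are ignored.
    cuts = []
    start = None
    for x, is_empty in enumerate(empty):
        if is_empty and start is None:
            start = x
        elif not is_empty and start is not None:
            if start > 0:
                cuts.append((start + x - 1) // 2)
            start = None
    return cuts

# Two strokes joined by a baseline, with a descender crossing the gap.
demo = np.zeros((8, 9), dtype=np.uint8)
demo[4, :] = 1        # baseline stroke
demo[1:4, 1:3] = 1    # first character body
demo[1:4, 6:8] = 1    # second character body
demo[5:7, 4] = 1      # descender below the baseline
print(find_cut_columns(demo, baseline_half_height=0))  # prints [4]
```

Without the whitening step, the baseline stroke and the descender would leave ink in every gap column, so the vertical projection would show no gap and the cut between the two strokes would be missed.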
Keywords/Search Tags: Uyghur Segmentation, Feature Extraction, Connected Domain, Uyghur Recognition