Study On Key Techniques Of Uyghur Character Recognition

Posted on: 2015-03-20  Degree: Master  Type: Thesis
Country: China  Candidate: N N Yang
GTID: 2348330518483753  Subject: Control theory and control engineering
Abstract/Summary:
With the maturing of computer science and technology, optical character recognition has become an important research field within pattern recognition. However, because of the limited geographic region in which the script is used, the morphological characteristics of its characters, and its complex writing style, recognition technology for Arabic and for the Arabic-alphabet-based Uyghur script lags behind. Based on the recognition difficulties and structural characteristics of Uyghur text, this thesis studies the extraction of Uyghur text regions from natural scene images, the extraction of Uyghur character features, and the design of a Uyghur character recognition classifier, and finally builds a Uyghur character recognition system. The details are as follows:
(1) In text region detection and extraction, classical image processing techniques are first used to reduce noise and binarize the original image; the skeleton points of the image are then found and clustered. The intensity distribution of the original grayscale image is computed and smoothed to obtain the number of peaks. The neighborhood of the skeleton points is grown according to the peak type, small connected-component noise is excluded, and the text region is extracted. The extracted region is then separated into a series of character images, which are normalized in preparation for feature extraction.
(2) In feature extraction, major strokes and sub strokes are separated by connected component labeling and treated as two separate objects of study, and their features are then extracted. For major strokes, the main features extracted are the pixel distribution feature, the grid density feature, the ring hollow feature, and the projection morphological feature. For sub strokes, the features extracted are Radon transform features in four directions, the number of sub strokes, the positional relationship between major strokes and sub strokes, and the positional relationship among multiple sub strokes.
(3) In classifier design, a four-level classification mechanism is proposed. First, after major strokes and sub strokes have been segmented and their features extracted separately, the grid density features of the major strokes and the Radon transform features of the sub strokes are input into a BP neural network classifier for initial recognition of each stroke. Because there are many types of major stroke, similarly shaped major strokes can be misidentified; to handle this, a feature correction vector combining the ring hollow feature and the projection morphological feature is built to check whether the local structure of the test image is consistent with the recognition result. Second, characters are classified according to the number of sub strokes. Next, for characters with multiple sub strokes, including characters that contain several similar sub strokes, classification is done by determining the positional relationship between the major stroke and all sub strokes. Finally, the positional relationships among the sub strokes and the connection style of all strokes are determined; combining these results with the major stroke recognition result gives the final recognition result.
(4) A system covering the entire recognition process is designed with a MATLAB GUI, providing a basis for subsequent research.
Keywords/Search Tags:Uyghur character recognition, text region extraction, feature extraction, classifier design
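
To make step (2) of the abstract more concrete, the following is a minimal Python sketch of separating major and sub strokes by connected component labeling and then computing a grid density feature for the major stroke. The abstract does not give the thesis's exact definitions, so several choices here are assumptions made only for illustration: 8-connectivity, treating the largest component as the major stroke, and defining grid density as the fraction of the stroke's pixels falling in each cell of an n x n grid. The function names (`label_components`, `split_strokes`, `grid_density`) are hypothetical and not taken from the thesis.

```python
from collections import deque


def label_components(img):
    """Find 8-connected components in a binary image (list of rows of 0/1).

    Returns a list of components, each a list of (row, col) pixels.
    """
    h, w = len(img), len(img[0])
    seen = [[False] * w for _ in range(h)]
    comps = []
    for r in range(h):
        for c in range(w):
            if img[r][c] and not seen[r][c]:
                # Breadth-first flood fill from an unvisited foreground pixel.
                seen[r][c] = True
                queue, pixels = deque([(r, c)]), []
                while queue:
                    y, x = queue.popleft()
                    pixels.append((y, x))
                    for dy in (-1, 0, 1):
                        for dx in (-1, 0, 1):
                            ny, nx = y + dy, x + dx
                            if (0 <= ny < h and 0 <= nx < w
                                    and img[ny][nx] and not seen[ny][nx]):
                                seen[ny][nx] = True
                                queue.append((ny, nx))
                comps.append(pixels)
    return comps


def split_strokes(img):
    """Assumption: the largest component is the major stroke, the rest are sub strokes."""
    comps = sorted(label_components(img), key=len, reverse=True)
    return comps[0], comps[1:]


def grid_density(pixels, h, w, n=2):
    """Assumed definition: share of the stroke's pixels in each cell of an
    n x n grid laid over an h x w image, listed in row-major order."""
    counts = [0] * (n * n)
    for y, x in pixels:
        counts[(y * n // h) * n + (x * n // w)] += 1
    return [count / len(pixels) for count in counts]


# Toy 6x6 "character": a ring-shaped body with one detached dot above it.
img = [
    [0, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 0, 0],
    [1, 0, 0, 1, 0, 0],
    [1, 1, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
major, subs = split_strokes(img)
features = grid_density(major, len(img), len(img[0]), n=2)
print(len(major), len(subs), features)
```

In this toy image the ring is picked as the major stroke and the dot as the single sub stroke; the sub stroke count and the resulting density vector are the kind of values that the later classification levels described in step (3) would consume. A real system would likely use a finer grid on normalized character images.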