Design And Implementation Of A Chinese/English Mixed Text Recognition System

Posted on:2008-11-13

Degree:Master

Type:Thesis

Country:China

Candidate:Z Li

GTID:2208360212999920

Subject:Software engineering

Abstract/Summary:

With the rapid growth of international technical exchange, multi-language documents are becoming increasingly common. This raises a new research topic in document recognition: the recognition of multi-language documents. In China, documents that mix Chinese and English are very common. Because the languages differ, the characters in such a document must first be classified by language and then recognized with different methods.

Based on a study of current OCR systems and related technologies, this paper presents a recognition system for Chinese/English mixed characters. The work is as follows:

Firstly, to improve the quality of preprocessing and overcome the shortcomings of conventional character segmentation methods, this paper introduces a novel approach to segmenting Chinese/English mixed characters that is based on periods and recognition. By employing a new line-segmentation algorithm, the approach gives a more precise line-segmentation result. By using a new character classification algorithm and a new Chinese character component union algorithm, it produces a better segmentation result for Chinese/English mixed characters.

Secondly, this paper introduces an efficient architecture for character recognition. It provides a portable, extensible platform for developers and users. On this platform, users can change the recognition workflow dynamically to obtain a better recognition rate, and maintaining the system becomes more convenient.

Finally, through an in-depth study of the key steps in the recognition process, we analyze and compare various algorithms, together with the advantages and disadvantages of each in different scenarios.

In summary, using the above algorithms, the recognition system achieves a recognition rate of 95% and a speed of 6 s per one hundred Chinese characters in pure Chinese text. In Chinese/English mixed text, the English character recognition rate is more than 85%.
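
The line and character segmentation steps named in the abstract can be illustrated with a standard projection-profile baseline. This is only a minimal sketch and not the thesis's new algorithm, whose details the abstract does not give. It assumes a binarized page stored as a list of rows (1 = ink, 0 = background); the function names `ink_runs`, `segment_lines`, and `segment_glyphs` and the `min_ink` parameter are hypothetical and chosen for illustration.

```python
from typing import List, Tuple

Image = List[List[int]]  # assumed format: binarized page, 1 = ink, 0 = background


def ink_runs(profile: List[int], min_ink: int = 1) -> List[Tuple[int, int]]:
    """Return (start, end) index ranges, end exclusive, where profile >= min_ink."""
    runs = []
    start = None
    for i, ink in enumerate(profile):
        if ink >= min_ink and start is None:
            start = i                      # a run of ink begins
        elif ink < min_ink and start is not None:
            runs.append((start, i))        # a gap closes the current run
            start = None
    if start is not None:                  # run reaches the end of the profile
        runs.append((start, len(profile)))
    return runs


def segment_lines(image: Image, min_ink: int = 1) -> List[Tuple[int, int]]:
    """Split a page into text lines using the horizontal projection profile."""
    row_profile = [sum(row) for row in image]   # ink pixels per row
    return ink_runs(row_profile, min_ink)


def segment_glyphs(image: Image, top: int, bottom: int,
                   min_ink: int = 1) -> List[Tuple[int, int]]:
    """Split one text line (rows top..bottom-1) using the vertical projection profile."""
    width = len(image[0]) if image else 0
    col_profile = [sum(image[y][x] for y in range(top, bottom))
                   for x in range(width)]       # ink pixels per column
    return ink_runs(col_profile, min_ink)


if __name__ == "__main__":
    page = [
        [0, 0, 0, 0, 0, 0],
        [1, 1, 0, 0, 1, 0],
        [1, 1, 0, 0, 1, 0],
        [0, 0, 0, 0, 0, 0],
        [0, 1, 1, 1, 0, 0],
        [0, 0, 0, 0, 0, 0],
    ]
    for top, bottom in segment_lines(page):
        print((top, bottom), segment_glyphs(page, top, bottom))
```

A plain column split like this one cuts at every vertical gap, so a Chinese character made of separate left and right components may come out as two pieces. That limitation is one reason a mixed-text system would need steps such as the character classification and component union that the abstract mentions.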

Keywords/Search Tags:

Chinese character recognition, character segmentation, mixed characters, feature extraction

Related items

1	Research On Off-line Handwritten Chinese Character Recognition System
2	Study On The Key Points Of The Image Acquisition And Processing For Characters Pressed On Labels
3	Study On The Feature Extraction And Recognition Of The Pressed Protuberant Characters On A Metal Label
4	Research On Segmentation Algorithm Based On Statistical Classification For Mixed Characters
5	Algorithms Research On Thinning, Feature Extracting And Similar Chinese Characters Recognition For Off-line Handwritten Chinese Character Recognition
6	A Study On The Segmentation Of The Mixed Arranging Character
7	The Study Of Segmentation And Recognition For Video Image Character
8	The Study On The Mechanism Of Humanoid Recognition For Video Image Chinese Character
9	Research On Characters Extraction And Recognition Method Of Off-line Handwritten Chinese Character Based On Procedure Neural Networks