| With the rapid development of computing, image processing, human-computer interaction, and pattern recognition have all advanced considerably. Text recognition is one of the most important fields of pattern recognition. Chinese character recognition is particularly difficult, both because research on it started late and because the characters are structurally complex. This paper studies the recognition of information printed on train tickets. The information includes digits, letters, and Chinese characters, with the main focus on Chinese character recognition. The paper first introduces the basic framework and theory of an OCR system for Chinese characters, including the development and current status of OCR technology and the current state of ticket exchange for sleeper train tickets, and it sets out the background, the problems to be solved, and the significance of the research. It then analyzes each of the four modules of the system and proposes a suitable processing method for each. In the preprocessing stage, train tickets are a special case because they contain colored noise, so a de-noising algorithm based on color space is proposed to create favorable conditions for binarization. Several binarization methods are then analyzed, and a method suited to the characteristics of train tickets is proposed, which gives satisfactory results. Because only part of the information on a ticket needs to be extracted, layout analysis is applied to the ticket to obtain the required text blocks. Characters are then segmented with a projection-based method. Because the printing quality of train tickets is poor, characters may touch or break, so the coarse segmentation is followed by a fine segmentation. Because Chinese characters are structurally complex, coarse peripheral features are extracted and C-means clustering is used to build the feature database. Peripheral features and density features are then extracted for character recognition, and the feature space is compressed to improve performance. For single-character recognition, several commonly used recognition methods are analyzed, and a method based on both template and structure is proposed, with the nearest neighbor rule deciding the recognition result. A post-processing module is added to improve recognition accuracy. Finally, the recognized information is stored in text form. |
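The projection-based coarse segmentation mentioned in the abstract can be illustrated with a minimal sketch. It assumes the text line has already been binarized into a list of rows with 1 for ink and 0 for background. The function names `vertical_projection` and `segment_columns` and the `min_count` parameter are hypothetical, chosen for illustration, and are not taken from the paper. The sketch covers only the coarse step; it does not handle touching or broken characters, which the paper addresses in a separate fine segmentation.

```python
from typing import List, Tuple


def vertical_projection(binary: List[List[int]]) -> List[int]:
    """Count foreground (ink) pixels in each column of a binary image."""
    if not binary:
        return []
    width = len(binary[0])
    return [sum(row[x] for row in binary) for x in range(width)]


def segment_columns(binary: List[List[int]], min_count: int = 1) -> List[Tuple[int, int]]:
    """Split a text line into column spans separated by blank columns.

    A column counts as part of a character when its projection value is at
    least min_count (an assumed threshold). Each span is (start, end) with
    end exclusive.
    """
    profile = vertical_projection(binary)
    spans: List[Tuple[int, int]] = []
    start = None
    for x, count in enumerate(profile):
        if count >= min_count and start is None:
            start = x  # entering a run of ink columns
        elif count < min_count and start is not None:
            spans.append((start, x))  # leaving a run of ink columns
            start = None
    if start is not None:
        spans.append((start, len(profile)))  # run reaches the right edge
    return spans


# Toy text line containing two separated glyphs.
line = [
    [0, 1, 1, 0, 0, 1, 0],
    [0, 1, 0, 0, 0, 1, 1],
    [0, 0, 1, 0, 0, 1, 0],
]
spans = segment_columns(line)
```

For the toy image, `spans` holds two column ranges, one per glyph. Raising `min_count` makes the split more aggressive, which can help separate lightly touching characters but also risks cutting a single character into pieces.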