| Character segmentation is a key and difficult preliminary step in an OCR system, because the correctness of recognition depends on it. The segmentation must therefore be both accurate and efficient. Todo Mongolian is a connected (cursive) script: the letters of a word are joined along a trunk line with no blank space between them, and each letter has three written forms, used at the beginning, middle, and end of a word. This large variation in letter shape makes segmentation highly difficult. For these reasons, after analyzing the current state of segmentation technology, this paper uses the integral projection method and the contour tracing method to segment the letters of printed Todo Mongolian document images. The aim is to select the segmentation method best suited to Todo Mongolian on the basis of an analysis of the script's own characteristics, and to study the difficulties encountered during character segmentation. The main work of this paper is as follows. (1) The scanned document image is preprocessed: a median filter is selected for denoising and the maximum between-class variance method (Otsu's method) for binarization, removing as much interfering information as possible and preparing for accurate letter segmentation. (2) The image is rotated counterclockwise by 90° so that the vertical trunk lies horizontally and the text in the document image runs horizontally, which makes line segmentation straightforward. To address the possible mis-segmentation of the detached stroke attachments peculiar to the Todo Mongolian alphabet, a mark-locking approach is adopted: using morphological operations, the stroke attachments are assigned back to the neighboring phrases so that they are not segmented incorrectly. (3) The integral projection method and the contour tracing method are first applied separately in letter segmentation experiments, but the results are poor. The two methods are therefore combined, with the contour tracing step mainly using the Ramer-Douglas-Peucker algorithm to compute an approximate polygon of the contour. The combined method segments letters better than the integral projection method alone. On 30 text images containing 66715 letters in total, the segmentation accuracy reaches up to 97%... |
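The binarization and integral projection steps described in (1) and (2) can be illustrated with a minimal pure-Python sketch. It is not the paper's implementation: the function names (`otsu_threshold`, `binarize`, `projection`, `split_runs`) and the `min_ink` parameter are hypothetical, the image is assumed to be a list of rows of 0..255 grayscale values with dark ink on a light background, and median filtering and rotation are omitted.

```python
from typing import List, Tuple

Image = List[List[int]]  # grayscale rows, pixel values 0..255 (assumed layout)


def otsu_threshold(gray: Image) -> int:
    """Return the threshold that maximizes the between-class variance."""
    hist = [0] * 256
    for row in gray:
        for px in row:
            hist[px] += 1
    total = sum(hist)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels with value <= t
    sum0 = 0  # sum of the intensities of those pixels
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, variance undefined
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        var_between = w0 * w1 * (mean0 - mean1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def binarize(gray: Image, threshold: int) -> Image:
    """Mark dark (ink) pixels as 1 and background pixels as 0."""
    return [[1 if px <= threshold else 0 for px in row] for row in gray]


def projection(binary: Image, axis: str) -> List[int]:
    """Integral projection: ink count per row (axis='row') or per column."""
    if axis == "row":
        return [sum(row) for row in binary]
    return [sum(col) for col in zip(*binary)]


def split_runs(profile: List[int], min_ink: int = 1) -> List[Tuple[int, int]]:
    """Return half-open (start, end) ranges where profile >= min_ink."""
    runs, start = [], None
    for i, value in enumerate(profile):
        if value >= min_ink and start is None:
            start = i
        elif value < min_ink and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(profile)))
    return runs
```

With `min_ink=1` the split only finds fully blank gaps, which separates lines and words but not letters joined by a trunk. Raising `min_ink` above the trunk thickness so that trunk-only columns become cut candidates is one possible adaptation; that is an assumption here, not a rule stated in the abstract.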
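The polygon approximation mentioned in (3) can be sketched as follows. This is a textbook recursive form of the Ramer-Douglas-Peucker algorithm for an open polyline, not the paper's code: the names `point_line_distance` and `rdp` and the tolerance `epsilon` are illustrative assumptions, and a closed letter contour would first have to be split into open pieces (for example at two far-apart points).

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def point_line_distance(p: Point, a: Point, b: Point) -> float:
    """Perpendicular distance from p to the line through a and b.

    Falls back to the distance from p to a when a and b coincide.
    """
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    length = math.hypot(dx, dy)
    if length == 0:
        return math.hypot(px - ax, py - ay)
    # |cross product| / |b - a| gives the perpendicular distance.
    return abs(dx * (py - ay) - dy * (px - ax)) / length


def rdp(points: List[Point], epsilon: float) -> List[Point]:
    """Simplify an open polyline, keeping points farther than epsilon
    from the chord joining the current end points."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    max_dist, index = 0.0, 0
    for i in range(1, len(points) - 1):
        d = point_line_distance(points[i], first, last)
        if d > max_dist:
            max_dist, index = d, i
    if max_dist > epsilon:
        # Keep the farthest point and simplify both halves recursively.
        left = rdp(points[: index + 1], epsilon)
        right = rdp(points[index:], epsilon)
        return left[:-1] + right  # drop the duplicated split point
    return [first, last]
```

A larger `epsilon` yields a coarser polygon with fewer vertices. Which vertices of the approximate polygon would then serve as cut candidates between letters is not specified in the abstract.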