A Hierarchical OCR System for Aviation Weather Maps and Several of Its Key Technologies

Posted on: 2011-05-22
Degree: Master
Type: Thesis
Country: China
Candidate: X G Tian
GTID: 2208360308966955
Subject: Pattern Recognition and Intelligent Systems
Abstract/Summary:
An OCR (Optical Character Recognition) system processes digitized documents. As digital documents have developed, the weaknesses of traditional OCR systems have become apparent. In particular, they cannot handle documents such as aerial meteorological maps, which have complex backgrounds and abundant data including characters, symbols, and lines.
The subject of this dissertation comes from the commercial project "OCR System for the Aerial Meteorological Maps with Digitalization and Hierarchy". Based on an analysis of the research status and existing problems of the theories and technologies related to OCR systems, this work studies the fundamental theory and methods of image segmentation, separation of figures and text, and object recognition, with the goal of developing an experimental prototype of the project's OCR system. The main research contents and findings are as follows:
(1) We propose an effective approach for segmenting foreground from background based on the entropy of an oblique 2-D histogram. In addition, following the theory of K-L distance, we propose another measure that reduces complexity and running time. The whole approach combines local and global information, which resolves a double-edged-sword problem: segmentation based only on local information introduces errors into the global segmentation, while global segmentation leaves incomplete information in local areas.
(2) For separating figures from text, we rely mainly on a theoretical analysis based on mathematical morphology. One structuring element in this approach determines the overall result, so through theoretical derivation and experiment we establish the size of this element and analyze the reasons for its effect. Compared with traditional approaches, this approach performs well in running time and recognition rate.
(3) We propose a new, effective approach to line reconstruction and recognition, which fills a gap in this area. Using the mathematical expression of line-to-line relations and the PSO (particle swarm optimization) algorithm, we discuss and analyze the related theory systematically. The proposed approach is both fast and accurate in reconstruction, and it can repair broken solid lines. Furthermore, we identify a difference between line styles (dashed and solid): dashed lines follow a regular distribution, whereas solid lines do not, so an energy function can be used to separate the two.
(5) We propose a novel moment expression, from which we derive moment invariants and prove the range of their orders. The approach is based mainly on central moments and complex moments, so it has the merits of both: it is unaffected by orientation and can be converted into other expressions under different polar representations. We also point out a problem with moment invariants: neither the dependent nor the independent invariants alone are the best choice. For better results, invariants should be selected across all orders.
(6) For pattern matching of single characters, we propose a multi-level normalized cross correlation algorithm. By means of Jensen's inequality and the Cauchy-Schwarz inequality, a DGA graph structure, breadth-first search rules, and the winner-update algorithm, this approach addresses both the need to reselect better invariants across all orders and the high time cost. In addition, it uses a hash table, drawing on database theory, to speed up the search for the final result and to allocate memory sensibly.
(7) Building on single-character recognition and the bionic Drop-Falling Algorithm for segmentation, we extend the approach to the recognition of merged characters. In particular, to address existing problems in the Drop-Falling Algorithm, we introduce the theory of kinematics into it, which makes the approach more reasonable and complete. In experiments it performs better and takes little time on linearly merged, nonlinearly merged, and overlapped characters.
Using the theory and key technologies above, a prototype OCR system has been built successfully. Test results show that it achieves a high recognition rate for characters and lines (above 98%) and preserves intact results after image segmentation and the separation of text and figures. The research in this dissertation helps advance the development of OCR systems.
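As an illustration of the matching score that item (6) builds on, the following is a minimal sketch of plain (single-level) zero-mean normalized cross correlation between a glyph and a set of templates. It is not the multi-level algorithm of the dissertation: the bounding, winner-update, and hash-table steps are omitted, and the names `ncc` and `best_match` as well as the tiny 3x3 glyphs are assumptions made for the example. The Cauchy-Schwarz inequality is what guarantees the score stays within [-1, 1].

```python
import math


def ncc(a, b):
    """Zero-mean normalized cross correlation of two equal-sized 2-D patches."""
    xs = [v for row in a for v in row]
    ys = [v for row in b for v in row]
    if len(xs) != len(ys):
        raise ValueError("patches must have the same number of pixels")
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    # Numerator: covariance-like sum of the mean-removed pixel products.
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    # Denominator: product of the two mean-removed norms.
    den = math.sqrt(sum((x - mx) ** 2 for x in xs) *
                    sum((y - my) ** 2 for y in ys))
    # A constant patch has zero variance; treat it as uncorrelated.
    return num / den if den else 0.0


def best_match(glyph, templates):
    """Return the name of the template with the highest NCC score."""
    return max(templates, key=lambda name: ncc(glyph, templates[name]))


# Hypothetical 3x3 binary templates, for illustration only.
TEMPLATES = {
    "T": [[1, 1, 1], [0, 1, 0], [0, 1, 0]],
    "L": [[1, 0, 0], [1, 0, 0], [1, 1, 1]],
}
noisy_t = [[1, 1, 1], [0, 1, 0], [0, 1, 1]]  # "T" with one flipped pixel
label = best_match(noisy_t, TEMPLATES)
```

Because the score removes the mean and divides by the norms, it is unchanged when a patch is brightened or scaled in contrast, which is one reason it suits scanned maps with uneven backgrounds.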
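To make the idea of an orientation-independent moment invariant in item (5) concrete, here is a small sketch using the classical first Hu invariant, computed from normalized second-order central moments. This is a standard textbook quantity swapped in for illustration, not the novel moment expression proposed in the dissertation. The function names and the 3x3 test shape are assumptions.

```python
def central_moment(img, p, q):
    """Central moment mu_pq of a 2-D grey-level image given as a list of rows."""
    m00 = sum(v for row in img for v in row)
    if m00 == 0:
        raise ValueError("image has no mass")
    # Centroid (cx, cy); x indexes columns, y indexes rows.
    cx = sum(x * v for row in img for x, v in enumerate(row)) / m00
    cy = sum(y * v for y, row in enumerate(img) for v in row) / m00
    return sum(((x - cx) ** p) * ((y - cy) ** q) * v
               for y, row in enumerate(img) for x, v in enumerate(row))


def hu_phi1(img):
    """First Hu invariant: eta_20 + eta_02."""
    mu00 = central_moment(img, 0, 0)
    # eta_pq = mu_pq / mu_00 ** (1 + (p + q) / 2); the exponent is 2 for order 2.
    return (central_moment(img, 2, 0) + central_moment(img, 0, 2)) / mu00 ** 2


def rotate90(img):
    """Rotate the image 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]


shape = [[1, 0, 0],
         [1, 0, 0],
         [1, 1, 0]]
phi_original = hu_phi1(shape)
phi_rotated = hu_phi1(rotate90(shape))
```

For this shape the two values agree, since a 90-degree rotation only swaps the roles of the two second-order central moments. Arbitrary rotation angles on a pixel grid would agree only approximately because of resampling.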
Keywords/Search Tags: OCR system, image segmentation, separation of the lines and context, recomposition and recognition of lines, character recognition