| The informatization of our country has entered a new phase, which is especially significant for the development of commercial banks. The volume of financial documents has risen sharply, driven both by an active domestic financial market stimulated by growing customer demand and by the global expansion of these banks' business. The volume is growing so rapidly that the documents are becoming intractable. This urgent situation calls for efficient and effective automatic financial document processing systems. A typical automatic financial document processing system consists of a front-end cheque processing system and a back-end subsequent supervision system. When a bank bill is fed into the cheque processing system, its elements, such as the cheque number, account, date and amount, are extracted, processed and recognized automatically by image processing and OCR techniques. This information is then sent to the subsequent supervision system. Some unique elements are used as the index of the bill, which helps to store and search the images. The subsequent supervision system stores the information and verifies its correctness by comparing it with the information provided by the core business system. For cheques bearing seals, the seal recognition system verifies the validity of each seal by comparing it with the corresponding valid seal held in the bank's seal database. All these procedures are carried out automatically by intelligent systems, which saves valuable human resources and greatly increases the efficiency of business processing. As the most important component of the automatic financial document processing system, the bank document OCR system first performs layout analysis and document classification, and then extracts and recognizes the business elements automatically for further processing.
It is a comprehensive system involving image processing, document analysis, form processing, feature extraction, classification and many other intelligent technologies, and it is a typical application of pattern recognition and artificial intelligence techniques. The system is valuable not only for banking, but also for document processing tasks in many other areas such as insurance, CIQ, revenue, education, post, hospitals and government. The key to a successful automatic financial document processing system is to maintain a good recognition rate with high reliability and robustness in different situations. To this end, this dissertation studies several pivotal problems in the system and gives corresponding practical solutions. The contributions of the dissertation fall into three aspects. 1. To improve the performance of layout analysis, we first introduce an accurate frame line detection algorithm based on the characteristics of form document images. Second, a frame-line-based classification method is proposed to identify financial documents of different categories. In this method, the degree of matching between a sample and the bill templates is computed by a new correlative matching model. The experimental results show the effectiveness of this algorithm. 2. In the area of image preprocessing, three new algorithms are proposed: 1) A binarization algorithm based on the MDE characteristic of strokes and enhanced background suppression is proposed to overcome the difficulty of extracting characters from complicated backgrounds, which arise mainly from seal imprints of different types and positions that are often dark and stroke-like. 2) An improved frame line removal algorithm is introduced. First, after the line detection procedure, the chain code method is applied to describe the detected frame line regions in gray images. Then, the cross-points of characters and lines are detected, analyzed and marked by their overlapping types.
Finally, the frame lines are removed according to the cross-point marks. 3) A new segmentation strategy for unconstrained handwritten digits is realized by integrating the information obtained from profile analysis with topological structure analysis. The experimental results demonstrate the superiority of the proposed algorithms over some other algorithms. 3. To improve the performance of handwritten digit recognition, two methods with different emphases are proposed: 1) an algorithm that fuses structural and statistical features, making good use of the complementary information in these two kinds of features and markedly improving the recognition rate; 2) a method that combines LDA (linear discriminant analysis) with AP (affinity propagation) clustering, which not only avoids the disturbance of noise in the training set but also improves recognition efficiency. The two proposed algorithms work well on both simulated data sets and real-world applications. Finally, we present two application instances of our system: the bank subsequent supervision system and a subsystem of CIS (the national Cheque Image System). The application results on real financial bill images illustrate the validity and practicality of our system. |
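
The abstract does not describe how the frame line detection in the first contribution works. As a rough illustration of the general idea only, the sketch below marks a row of a binarized form image as a horizontal frame line when it contains a sufficiently long unbroken run of ink pixels. This run-length rule is a common baseline and is an assumption here, not the dissertation's algorithm; the names `longest_run`, `detect_horizontal_lines` and `min_run`, and the toy image, are hypothetical.

```python
from typing import List, Tuple

# Binarized image as rows of pixels: 1 = ink, 0 = background (assumed encoding).
Image = List[List[int]]


def longest_run(row: List[int]) -> Tuple[int, int]:
    """Return (start_col, length) of the longest run of ink pixels in a row."""
    best_start, best_len = 0, 0
    start, length = 0, 0
    for x, pixel in enumerate(row):
        if pixel:
            if length == 0:
                start = x  # a new run begins here
            length += 1
            if length > best_len:
                best_start, best_len = start, length
        else:
            length = 0  # the run is broken by a background pixel
    return best_start, best_len


def detect_horizontal_lines(image: Image, min_run: int) -> List[Tuple[int, int, int]]:
    """Return (row, start_col, length) for each row whose longest run >= min_run."""
    lines = []
    for y, row in enumerate(image):
        start, length = longest_run(row)
        if length >= min_run:
            lines.append((y, start, length))
    return lines


# Toy 5x8 "form": two frame lines (rows 1 and 4) and a few character strokes.
form = [
    [0, 0, 0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1, 1, 1],
    [0, 0, 1, 1, 0, 0, 0, 0],
    [0, 0, 1, 0, 0, 1, 0, 0],
    [1, 1, 1, 1, 1, 1, 1, 0],
]

print(detect_horizontal_lines(form, min_run=6))  # [(1, 0, 8), (4, 0, 7)]
```

Short character strokes fall below `min_run` and are ignored, which is why the threshold would need tuning to the image resolution. Vertical lines could be found the same way on the transposed image. Skewed or broken lines, which real scanned bills often contain, would need a more tolerant method.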