Design And Implementation Of Identification System Of Financial Statements

Posted on: 2012-09-30
Degree: Master
Type: Thesis
Country: China
Candidate: Y Zheng
Full Text: PDF
GTID: 2178330335950183
Subject: Software engineering
Abstract/Summary:
Statements are a very effective way of presenting data in the financial field. When the volume of data is large, however, manually entering it into a computer for analysis, statistics, and storage clearly cannot meet modern demands for efficiency. In addition, Chinese characters and Arabic numerals on paper become difficult to verify over time. We therefore researched and developed an image-based recognition system for report data. The system's main function is to convert paper reports into images and then recognize the data on the statements. The recognized data can be stored on the computer to facilitate later queries and changes. A large body of related literature on image recognition technology was surveyed before this thesis was completed.

First, the thesis studies image preprocessing, applying image transformation, image binarization, and image tilt correction to the system. Image transformation converts oblique or reversed images into standard images. The thesis briefly introduces image rotation; the specific rotation is used in tilt correction. Binarization converts an image with multiple gray levels into an image with only two gray levels, which makes the subsequent recognition step less difficult, so it is an indispensable step. Tilt correction rotates an oblique image: the Hough transform is used to detect the tilt, and the correction brings the image into a standard state that is conducive to character recognition.

Next, the third chapter describes character information extraction; characters in tabular form are the main target of this thesis. Because a report is composed of tables and data, data extraction must rely on the table.
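
The preprocessing stage described in this abstract (binarization followed by Hough-based tilt detection) can be illustrated with a minimal sketch. It is not the thesis's implementation: the fixed threshold of 128, the search range of plus or minus 15 degrees, the 0.5 degree step, and the function names `binarize` and `estimate_skew` are all assumptions made for illustration. The image is represented as a plain list of rows of 0-255 gray values.

```python
import math

def binarize(gray, threshold=128):
    """Map a grayscale image (rows of 0-255 ints) to 1 = ink, 0 = background.

    The fixed threshold is an assumption; the abstract does not say how
    the threshold is chosen.
    """
    return [[1 if px < threshold else 0 for px in row] for row in gray]

def estimate_skew(binary, max_angle=15.0, step=0.5):
    """Estimate the tilt (degrees) of the dominant near-horizontal line.

    Simplified Hough voting: for each candidate angle, every ink pixel votes
    for the offset (rho) of the line through it at that angle. The angle whose
    strongest line collects the most votes is returned. In image coordinates
    (y grows downward) a positive angle means the line descends to the right.
    """
    points = [(x, y) for y, row in enumerate(binary)
              for x, v in enumerate(row) if v]
    best_angle, best_votes = 0.0, 0
    n_steps = int(round(2 * max_angle / step))
    for i in range(n_steps + 1):
        angle = -max_angle + i * step
        t = math.radians(angle)
        sin_t, cos_t = math.sin(t), math.cos(t)
        votes = {}
        for x, y in points:
            # Points on one line tilted by `angle` share the same rho.
            rho = round(y * cos_t - x * sin_t)
            votes[rho] = votes.get(rho, 0) + 1
        peak = max(votes.values(), default=0)
        if peak > best_votes:
            best_angle, best_votes = angle, peak
    return best_angle
```

In this sketch, tilt correction would then rotate the image by the negative of the returned angle. A table ruling line makes a good voting target because it contributes many collinear ink pixels.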
Only when the table is located accurately can the characters be extracted completely. Handwritten characters often overlap the table frame, so the overlapping sections must be handled. First, the table in the image and the characters in it are detected. If characters cross the table boundary, the frame of the report is adjusted: the border is shifted to a suitable location and the previous border is removed. Once the table and the characters have been separated, the characters must be separated from one another. Characters are often joined, so touching characters must be split apart before they can be recognized accurately. The thesis proposes an improved line extraction algorithm, PR, and a new algorithm for locating and extracting the characters in tables, MRCCC.

Feature extraction is another key part of the automatic recognition field. The purpose of the character extraction described above is to recognize the characters; recognition works by extracting features from each character and matching them against the features of the correct characters. Feature extraction is therefore an important prerequisite for feature classification and feature matching. The thesis uses a feature extraction method that combines two stages: coarse classification first, then fine classification.

Chapter 5, building on the preceding chapters, develops a simple experimental system, the "Identification System of Financial Statements", to test the experimental results, and then summarizes the system.
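
The "coarse first, then fine" matching idea mentioned in this abstract can be sketched as follows. This is a hypothetical illustration, not the thesis's ECFEA algorithm, whose details are not given here. The zone-density features, the 2x2 and 4x4 grids, the `shortlist` parameter, and all function names are assumptions. Glyphs are binary images given as lists of rows (1 = ink).

```python
def zone_features(glyph, zones):
    """Fraction of ink pixels in each cell of a zones x zones grid."""
    h, w = len(glyph), len(glyph[0])
    feats = []
    for zy in range(zones):
        for zx in range(zones):
            ys = range(zy * h // zones, (zy + 1) * h // zones)
            xs = range(zx * w // zones, (zx + 1) * w // zones)
            ink = sum(glyph[y][x] for y in ys for x in xs)
            feats.append(ink / max(1, len(ys) * len(xs)))
    return feats

def distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))

def classify(glyph, templates, shortlist=3):
    """Two-stage match against `templates` (dict: label -> binary glyph).

    Coarse stage: rank templates by cheap 2x2 zone features and keep the
    closest `shortlist` candidates. Fine stage: pick the candidate whose
    4x4 zone features are nearest to the glyph's.
    """
    coarse = zone_features(glyph, 2)
    ranked = sorted(
        templates,
        key=lambda lab: distance(coarse, zone_features(templates[lab], 2)),
    )
    candidates = ranked[:shortlist]
    fine = zone_features(glyph, 4)
    return min(
        candidates,
        key=lambda lab: distance(fine, zone_features(templates[lab], 4)),
    )
```

The coarse stage only has to rule out clearly different templates. Shapes that look alike at low resolution, such as a vertical bar and a horizontal bar with equal ink in every quadrant, are left for the fine stage to tell apart.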
Keywords: Automatic identification, Hough, PR, ECFEA, MRCCC, VC++, Feature extraction, Classification