Research On Financial Invoice Processing And Intelligent Identification Algorithm

Posted on: 2021-04-27
Degree: Master
Type: Thesis
Country: China
Candidate: H Tian
GTID: 2518306107993239
Subject: Engineering (Control Engineering)
Abstract/Summary:
In recent years, the vigorous development of financial accounting has generated a large amount of currency transaction data. Some of this data is recorded in electronic invoices, but most of it is recorded on printed paper bills, which, because of their anti-counterfeiting and regulatory characteristics, are kept in large numbers in corporate repositories. At present, most enterprises hire many professional accountants to enter the paper bill data manually and then perform financial calculations and aggregations. The main problems with this approach are that it wastes a great deal of time and money and that the accuracy of the entered data is unsatisfactory, which hurts the efficiency of the enterprise. To address this paper bill entry problem, this thesis researches and designs a set of algorithms for the automatic identification and processing of financial bills, which can effectively shorten the processing cycle, improve bill entry efficiency, and reduce the risk of misrecorded or omitted data. The main work of the thesis is as follows.

First, invoice classification. Corporate financial bills come in many types, such as VAT invoices and transportation invoices. This thesis compares the results of the visual bag-of-words model and a CNN deep learning model, and adopts an improved LeNet neural network to classify seven common invoice templates with a classification accuracy of 98.54%.

Second, invoice preprocessing. Because of environmental factors during scanning or photographing, some invoice samples are skewed, low in contrast, distorted, or noisy when they enter the system. The invoice itself can also carry interference, such as text obscured by a seal. To make the subsequent positioning and recognition algorithms more accurate, this thesis mainly uses the OpenCV image processing library, combining rotation transformation, perspective transformation, color gamut separation, and contrast enhancement to preprocess the invoice images.

Third, calibration of the invoice region of interest (ROI). This thesis proposes a method that combines an ROI ratio screening positioning algorithm with a relative coordinate reference positioning algorithm, dividing positioning into a coarse stage and a fine stage. It also uses the Hough transform, together with the layout features of the invoices, to find the reference coordinates and calibrate the region of interest of the invoice image, which effectively improves accuracy and speed.

Fourth, text cutting and recognition. This thesis improves the iterative text cutting and recognition algorithm, emphasizes the continuity and accuracy of the processing flow, and integrates a convolutional neural network, a recurrent neural network, and the CTC sequence alignment model so that text recognition becomes a single CRNN end-to-end model. This avoids the added complexity and reduced accuracy caused by designing character segmentation and character recognition as separate steps.

The thesis uses Java and Python to handle the storage and distributed scheduling of a large number of invoices, and finally implements an automatic invoice recognition and processing system. The experimental results show that the overall recognition accuracy is 91% and that the average processing time for a single bill is within 1200 ms, which greatly reduces manpower and economic costs and achieves the expected effect.
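The CTC sequence alignment step named in the fourth item can be illustrated with a minimal greedy decoder. This is only a sketch under stated assumptions, not the thesis's implementation: the function name `ctc_greedy_decode`, the choice of index 0 as the blank symbol, and the digit alphabet are all hypothetical. It shows how per-frame predictions from a CRNN-style network can be turned into a text string without segmenting individual characters first.

```python
from itertools import groupby

BLANK = 0  # assumption: class index 0 is the CTC blank symbol


def ctc_greedy_decode(frame_scores, alphabet, blank=BLANK):
    """Greedy (best-path) CTC decoding.

    frame_scores: list of per-frame score lists, one score per class.
    alphabet: string mapping class index to character; the character at
              the blank index is a placeholder and is never emitted.
    """
    # Step 1: pick the highest-scoring class for every frame.
    best_path = [
        max(range(len(scores)), key=scores.__getitem__)
        for scores in frame_scores
    ]
    # Step 2: merge consecutive repeats of the same class.
    collapsed = [idx for idx, _ in groupby(best_path)]
    # Step 3: drop blanks; a blank between two equal classes keeps both.
    return "".join(alphabet[idx] for idx in collapsed if idx != blank)


def one_hot(index, size):
    """Helper for the demo: a score vector that peaks at `index`."""
    return [1.0 if j == index else 0.0 for j in range(size)]


if __name__ == "__main__":
    alphabet = "-0123456789"  # hypothetical alphabet; "-" stands for blank
    path = [1, 1, 0, 1, 2, 2, 0, 0, 3]
    frames = [one_hot(i, len(alphabet)) for i in path]
    print(ctc_greedy_decode(frames, alphabet))
```

Because repeats are merged before blanks are removed, the blank between the two frames of class 1 is what allows a doubled character to survive decoding. Beam search decoding is a common alternative when the best path alone is not accurate enough.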
Keywords/Search Tags:Ticket recognition, convolutional neural network, image processing, CRNN end-to-end model