Design and Implementation of a Taxi Invoice Information Recognition Algorithm

Posted on: 2021-07-30
Degree: Master
Type: Thesis
Country: China
Candidate: P Y Li
GTID: 2518306308970979
Subject: Software engineering
Abstract/Summary:
With the rapid development of the economy and society, people's material standard of living has improved greatly. Everyday consumption generates invoices of many kinds, such as taxi invoices, air travel itineraries, and train tickets. The information on these invoices is valuable when enterprises process reimbursements, and taxi invoices are the most commonly reimbursed. Designing an information recognition algorithm for taxi invoices therefore has considerable research value.

Chapter 3 of this thesis addresses taxi invoice information recognition with a text detection and recognition algorithm built on the traditional OCR pipeline. Maximally stable extremal regions (MSER) and an HSV-based localization algorithm are used to extract the invoice area from natural scenes and to locate the key information on the taxi invoice. The projection method is then used for precise character segmentation, and a dedicated character set is generated by training on the pin-type (dot-matrix style) fonts used on taxi invoices. Finally, Tesseract is called to perform character recognition.

Chapter 4 turns to deep learning, which has developed rapidly in recent years, with code developed on the TensorFlow and Keras frameworks. A training set of taxi invoices was collected and annotated, and a model with a high recognition rate was obtained after many iterations of training and network parameter tuning. CTPN and CRNN are combined to perform text detection and end-to-end recognition of the regions containing the key information of the taxi invoice.

Chapter 5 describes the taxi invoice information recognition system, implemented in Python and PyQt5. The system supports uploading and recognizing taxi invoice images and shows good robustness.

In summary, this thesis designs text detection and recognition algorithms for taxi invoices based on Tesseract and on deep learning, and builds a Python system that encapsulates both recognition algorithms. The traditional Tesseract approach achieves recognition accuracies of 64%, 75%, 71%, and 70% for taxi number, ride date, ride time, and ride amount, respectively, for an overall accuracy of 70%. CTPN text-region detection reaches an accuracy of 97%, and CRNN recognition of taxi invoice text reaches 83%.
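
The projection-based cutting step mentioned in the abstract can be illustrated with a minimal sketch. The abstract gives no implementation details, so everything here is an assumption: the function names `vertical_projection` and `split_columns`, the `min_pixels` threshold, and the toy binary image are hypothetical and are not taken from the thesis. The idea shown is the generic vertical projection technique: count foreground pixels per column, then cut wherever the count drops below a threshold.

```python
def vertical_projection(binary):
    """Count foreground pixels (value 1) in each column of a binary image.

    `binary` is a list of equal-length rows containing 0 (background)
    or 1 (foreground). This representation is an assumption for the sketch.
    """
    if not binary:
        return []
    return [sum(column) for column in zip(*binary)]


def split_columns(binary, min_pixels=1):
    """Return (start, end) column ranges, end exclusive, for each run of
    columns whose projection is at least `min_pixels` (hypothetical threshold)."""
    profile = vertical_projection(binary)
    segments = []
    start = None
    for x, count in enumerate(profile):
        if count >= min_pixels and start is None:
            start = x                      # a character run begins
        elif count < min_pixels and start is not None:
            segments.append((start, x))    # the run ended at a gap column
            start = None
    if start is not None:
        segments.append((start, len(profile)))  # run touches the right edge
    return segments


# Toy 3x7 image with two "characters" separated by empty columns.
image = [
    [0, 1, 1, 0, 0, 1, 0],
    [0, 1, 0, 0, 0, 1, 0],
    [0, 1, 1, 0, 0, 1, 0],
]
segments = split_columns(image)
```

In a real pipeline the binary image would come from thresholding the located key-information region, and each returned column range would be cropped and passed on to the recognizer. Dot-matrix characters can contain internal gaps, so a practical version would likely need a minimum gap width or a smoothed profile; that refinement is left out of this sketch.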
Keywords/Search Tags: Taxi Invoice, OCR, Deep Learning, Text Detection, Text Recognition