
Research And Implementation Of Biz Card Recognition System Based On Tesseract-OCR Engine

Posted on: 2015-11-26
Degree: Master
Type: Thesis
Country: China
Candidate: S Wan
Full Text: PDF
GTID: 2298330422982743
Subject: Computer application technology
Abstract/Summary:
With the rapid development of the mobile Internet, using a phone to go online, chat, and even work is becoming part of our daily lives. The phone can also solve many common problems in work and life, and business card management is a good example. Exchanging business cards is an important part of professional contact, yet there is no efficient way to manage the large number of cards a person receives. With OCR (optical character recognition) technology to identify contact information, a user only needs to photograph a card to convert it quickly into an electronic card and save it to the address book, which makes card management easy. Relying on the open source character recognition engine Tesseract, this thesis aims to build a business card recognition system that meets basic recognition needs, is open source and stable, and exposes a fast API, so that small and medium-sized enterprises (SMEs) have a basic service on which to build their own business card management.

This thesis describes the design objectives and implementation of the business card recognition system. The system's web server uses an efficient Tornado and nginx architecture, which responds quickly to requests to its interface. Before characters are recognized, the business card image is preprocessed to eliminate interfering factors in the image and improve the subsequent character recognition. The thesis uses the open source Tesseract-OCR engine to recognize the characters on the card. To improve the recognition accuracy for English text, a character library is trained with the engine's own character library training method, which yields good recognition results.
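The abstract does not say which preprocessing steps the thesis applies. As one common possibility, the sketch below binarizes a greyscale image with Otsu's threshold, so that dark text is separated from a lighter card background before OCR. The function names `otsu_threshold` and `binarize` are hypothetical, and the image is assumed to be a list of rows of integer grey levels from 0 to 255.

```python
from typing import List, Optional


def otsu_threshold(pixels: List[int]) -> int:
    """Return the grey level (0-255) that maximises between-class variance."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_bg = 0    # number of pixels at or below the candidate threshold
    sum_bg = 0  # sum of grey levels of those pixels
    for t in range(256):
        w_bg += hist[t]
        if w_bg == 0:
            continue          # no dark pixels yet
        w_fg = total - w_bg
        if w_fg == 0:
            break             # every pixel is already in the dark class
        sum_bg += t * hist[t]
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        # Between-class variance, up to a constant factor of 1 / total**2.
        var = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t


def binarize(image: List[List[int]],
             threshold: Optional[int] = None) -> List[List[int]]:
    """Map each pixel to 0 (dark) or 255 (light) around the threshold."""
    if threshold is None:
        threshold = otsu_threshold([p for row in image for p in row])
    return [[255 if p > threshold else 0 for p in row] for row in image]
```

A real pipeline would typically add steps such as deskewing and noise removal, which this sketch leaves out.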
For the recognized characters, this thesis analyzes the keyword categories that appear on business cards and uses a hybrid approach to information classification to assign each piece of recognized text to the card field that matches its meaning. The result is finally returned to the requesting user in an electronic business card format that a mobile phone can read directly.

The system described in this thesis has been deployed on an Alibaba Cloud server, where it was tested several times and gave good results.
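The classification step can be illustrated with a minimal sketch. It assumes that the hybrid approach combines pattern rules (for e-mail addresses, URLs, and phone numbers) with keyword rules (for company names), and that the electronic business card format is vCard; the abstract confirms neither detail. All names, patterns, and keywords below are hypothetical.

```python
import re
from typing import Dict, List

# Illustrative patterns and keywords only; a real system would need broader rules.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
URL_RE = re.compile(r"(?:https?://|www\.)\S+", re.IGNORECASE)
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{6,}\d")
ORG_WORDS = {"inc", "ltd", "co", "corp", "company", "llc"}


def classify_lines(lines: List[str]) -> Dict[str, str]:
    """Assign each OCR text line to at most one card field (first match wins)."""
    fields: Dict[str, str] = {}
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        # Pattern rules run first because they are the least ambiguous.
        for key, pattern in (("email", EMAIL_RE), ("url", URL_RE), ("tel", PHONE_RE)):
            match = pattern.search(line)
            if match:
                fields.setdefault(key, match.group())
                break
        else:
            # Keyword rule: a company suffix marks the organisation line.
            words = set(re.findall(r"[a-z]+", line.lower()))
            if words & ORG_WORDS:
                fields.setdefault("org", line)
            else:
                # Heuristic fallback: the first unmatched line is the name.
                fields.setdefault("name", line)
    return fields


def to_vcard(fields: Dict[str, str]) -> str:
    """Render the fields as simplified vCard 3.0-style text."""
    props = (("name", "FN"), ("org", "ORG"), ("tel", "TEL"),
             ("email", "EMAIL"), ("url", "URL"))
    out = ["BEGIN:VCARD", "VERSION:3.0"]
    out += [f"{prop}:{fields[key]}" for key, prop in props if key in fields]
    out.append("END:VCARD")
    return "\r\n".join(out)
```

For example, `to_vcard(classify_lines(ocr_lines))` turns a list of OCR output lines into text that an address book could import. A strictly conforming vCard 3.0 also requires an `N` property, which this sketch omits.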
Keywords/Search Tags: Card Identification, Tesseract, Open Source, Image processing, Character recognition, Information classification