Research On Business Card Recognition System Based On OCR Technology

Posted on: 2009-04-07
Degree: Master
Type: Thesis
Country: China
Candidate: Y K Wu
Full Text: PDF
GTID: 2178360242492799
Subject: Computer application technology
Abstract/Summary:
In commerce and economic activity, the business card has become an important carrier of identity information. By language, business cards fall roughly into two kinds: bilingual and monolingual. Handling mixed Chinese and English text is one of the open problems in printed character recognition. In addition, existing layout analysis algorithms for business card documents have high complexity and are not well suited to the task. This thesis attempts to solve these problems by studying several new techniques. The author makes the following contributions:
(1) The thesis explains why a business card recognition system is needed, gives the general framework of such a system, and analyzes the overall characteristics of business cards.
(2) The thesis presents a method for business card layout analysis based on mathematical morphology. Using morphological operations and a search algorithm, the proposed method analyzes a complex business card layout quickly and accurately.
(3) Existing character segmentation methods cannot segment lines accurately when Chinese and English text is mixed and font sizes differ. The thesis introduces a novel approach to segmenting mixed Chinese/English characters, based on character-spacing periodicity and on recognition. The method uses a Chinese character separation algorithm based on the character-spacing period to determine the type of each connected region. The algorithm then merges the connected regions that belong to the same Chinese character, using a new recognition-based algorithm for merging Chinese character components. Experiments show that this method segments characters more accurately than the traditional projection-based line segmentation algorithm.
(4) Building on an information classification algorithm based on heuristic rules, the thesis proposes using layout information from the image to improve automatic categorization of the text information on business cards.
(5) The new algorithms are tested experimentally to validate the performance of the system.
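The abstract does not say which morphological operations or which search algorithm the layout analysis uses. As a rough illustration of the general idea only, the sketch below assumes a binarized card image, a horizontal dilation that smears neighboring characters into one blob, and a breadth-first search that collects connected regions as candidate text blocks. The function names, the `radius` parameter, and the toy `card` image are all hypothetical and are not taken from the thesis.

```python
from collections import deque

def dilate_horizontal(img, radius):
    """Binary dilation with a 1 x (2*radius+1) horizontal structuring element."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            lo, hi = max(0, x - radius), min(w, x + radius + 1)
            if any(img[y][lo:hi]):
                out[y][x] = 1
    return out

def find_blocks(img):
    """Return bounding boxes (top, left, bottom, right) of 4-connected regions."""
    h, w = len(img), len(img[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if not img[y][x] or seen[y][x]:
                continue
            # Breadth-first search over one connected region.
            seen[y][x] = True
            queue = deque([(y, x)])
            top, left, bottom, right = y, x, y, x
            while queue:
                cy, cx = queue.popleft()
                top, bottom = min(top, cy), max(bottom, cy)
                left, right = min(left, cx), max(right, cx)
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if 0 <= ny < h and 0 <= nx < w and img[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes

def layout_blocks(img, radius=1):
    """Assumed pipeline: smear characters together, then box each region."""
    return find_blocks(dilate_horizontal(img, radius))

# Toy binarized image (hypothetical): 1 = ink, 0 = background.
card = [
    [1, 0, 1, 0, 0, 0, 0, 1, 0, 1],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 0, 1, 0, 1, 0, 0, 0, 0],
]
print(layout_blocks(card, radius=1))
```

On the toy image, the seven isolated ink pixels merge into three blocks after dilation. A real system would pick the structuring element size from the estimated character spacing rather than a fixed radius.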
Keywords/Search Tags: BCR, OCR, character segmentation, information classification