Small Samples Identification In Chinese Postal Address Based On Lexicon Driven

Posted on: 2015-12-07
Degree: Master
Type: Thesis
Country: China
Candidate: L Zhong
Full Text: PDF
GTID: 2298330452950136
Subject: Circuits and Systems
Abstract/Summary:
Over the next five years, the average annual growth rate of postal mail and express delivery business is expected to stay above 20%. Traditional mail delivery and sorting, which rely on manual work, can hardly keep up with this volume. Most letter recognition and sorting systems use a bar code, identification code, zip code, or address information to route mail. The most conspicuous weakness of this approach is that it is time-consuming, laborious, and inefficient, because the zip code or address information is usually entered manually through a terminal. By comparison, using image processing techniques to recognize the zip code or address information by machine is effective and fast.

When mail is sorted automatically by zip code alone, a missing, irregularly handwritten, or incorrect zip code on a letter usually leads to sorting errors and failures, which cause the letter to be delayed or lost. To achieve a higher recognition rate, this thesis proposes a dictionary-driven method for automatic address segmentation and recognition, and makes a preliminary attempt at solving the address string segmentation and recognition problems. Using the dictionary-driven model, the recipient's province, city, district, and zip code are recognized automatically, and suggestions for further research are given. The main conclusions of this thesis are:

First, image preprocessing and segmentation. Preprocessing includes denoising, binarization, skew correction, address extraction, and normalization. Row projection is used to segment the letter image and extract the address information; local column projection is then used to split the address strings.

Second, handwritten postal address recognition. The thesis adopts a two-stage, feature-based recognition scheme. The first stage uses improved coarse grid features, or peripheral and inner features, for coarse classification of characters, which reduces the candidate set passed to the second stage. The second stage applies three-direction stroke density features or the local Fourier transform for fine classification within that candidate set.

Third, the dictionary-driven model. Handwritten Chinese characters exhibit heavy cursive writing, touching strokes, noise, and broken strokes, so the characters obtained by segmenting the image are often not independent and complete, which greatly lowers the recognition rate of individual characters. In the dictionary-driven approach, the address text image is cut into a set of parts, and prior knowledge from the dictionary is used to find the optimal segmentation path, which reduces the search space. If a character in the province, city, or district name is miswritten or non-standard, the recognition result is corrected automatically.

The proposed algorithm is tested with a small postal address dictionary and a small code dictionary. The experimental results verify the effectiveness of dictionary-driven optimization of the address search path and of the two-stage recognition algorithm, which combines coarse classification by secondary peripheral features with fine classification by directional stroke density.
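The dictionary-driven search for an optimal segmentation path can be illustrated with a small dynamic-programming sketch. This is not the thesis's implementation, only one common way to realize the idea under stated assumptions: the address image has already been over-segmented into `n_parts` primitive parts, a character may span at most `max_merge` consecutive parts, and a hypothetical recognizer `score(i, j, ch)` returns the cost (lower is better) of reading parts `i..j-1` as character `ch`. The names `match_entry`, `recognise`, `toy_costs`, and the cost values are all invented for illustration.

```python
from typing import Callable, List, Tuple

# Hypothetical recognizer: cost of reading parts[i:j] as character ch.
Scorer = Callable[[int, int, str], float]
INF = float("inf")


def match_entry(entry: str, n_parts: int, score: Scorer,
                max_merge: int = 3) -> Tuple[float, List[int]]:
    """Best way to group n_parts primitive parts into the characters of entry.

    Returns (total cost, cut positions); cost is INF if no grouping exists.
    """
    m = len(entry)
    # best[k][j]: minimum cost of reading the first j parts as the first k characters
    best = [[INF] * (n_parts + 1) for _ in range(m + 1)]
    back = [[-1] * (n_parts + 1) for _ in range(m + 1)]
    best[0][0] = 0.0
    for k in range(1, m + 1):
        for j in range(k, n_parts + 1):
            # character k covers parts i..j-1, i.e. at most max_merge parts
            for i in range(max(k - 1, j - max_merge), j):
                cost = best[k - 1][i] + score(i, j, entry[k - 1])
                if cost < best[k][j]:
                    best[k][j] = cost
                    back[k][j] = i
    if best[m][n_parts] == INF:
        return INF, []
    # Walk the back-pointers to recover the cut positions.
    cuts = [n_parts]
    j = n_parts
    for k in range(m, 0, -1):
        j = back[k][j]
        cuts.append(j)
    cuts.reverse()
    return best[m][n_parts], cuts


def recognise(lexicon: List[str], n_parts: int, score: Scorer,
              max_merge: int = 3) -> Tuple[str, float, List[int]]:
    """Pick the lexicon entry whose best segmentation path is cheapest."""
    results = [(match_entry(e, n_parts, score, max_merge), e) for e in lexicon]
    (cost, cuts), entry = min(results, key=lambda r: r[0][0])
    return entry, cost, cuts


# Toy data (invented): 5 primitive parts; unlisted (span, char) pairs cost 1.0.
toy_costs = {
    (0, 3, "湖"): 0.1, (0, 3, "河"): 0.6,
    (3, 5, "北"): 0.2, (3, 5, "南"): 0.9,
}


def score(i: int, j: int, ch: str) -> float:
    return toy_costs.get((i, j, ch), 1.0)


lexicon = ["湖北", "湖南", "河北", "河南"]
best_entry, best_cost, best_cuts = recognise(lexicon, 5, score)
print(best_entry, best_cuts)
```

Because only lexicon entries are ever scored, the search space shrinks from all character sequences to the dictionary size, and a poorly written character can still be resolved when the rest of the entry matches well. This is the sense in which a dictionary can correct a miswritten province, city, or district name.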
Keywords/Search Tags: envelope address, dictionary driven, Chinese character segmentation, local Fourier transform