Research On Machine-printed Mongolian Font Classification And Word Image Super Resolution Based On Deep Learning

Posted on: 2020-12-02
Degree: Master
Type: Thesis
Country: China
Candidate: Y Wen
Full Text: PDF
GTID: 2428330596492645
Subject: Computer Science and Technology
Abstract/Summary:
In recent years, with the rapid development of digital technology, a large body of Mongolian literature (books, journals, magazines and so on) can be converted into electronic documents using optical character recognition (OCR). Two problems arise in this conversion process.

First, existing Mongolian OCR systems usually recognize characters by segmentation. However, Mongolian words in some fonts are difficult to divide accurately into characters, which leaves those words unrecognizable. In addition, the distinctive word formation of Mongolian leads to a huge vocabulary (several million words), and documents that mix multiple fonts are widespread. Together these make it too time-consuming to train a single Mongolian word recognition system that covers a large number of words across different fonts. If the font of the word to be recognized can be determined, a separate word recognition system can be built in advance for each font, which would effectively address these problems.

Second, the resolution of printed Mongolian document images collected in some settings is too low for the words in them to be recognized effectively by a Mongolian OCR system. It is therefore necessary to reconstruct high-resolution images from the low-resolution ones so that they can be recognized.

For the first problem, this thesis takes the printed Mongolian word image as its object and treats font classification as a special image classification task. It proposes a printed Mongolian font classification method based on a Convolutional Neural Network (CNN). The Mongolian word image is the input to the network, and the corresponding Mongolian font type is the class label. Based on Mongolian word formation and writing characteristics, this thesis designs a shallow CNN architecture and compares it with three classical CNN models (LeNet-5, AlexNet and GoogLeNet) on related datasets. The experimental results show that the designed CNN architecture outperforms LeNet-5 and AlexNet and matches the performance of GoogLeNet, while its network structure is shallower and it has fewer parameters. The CNN architecture designed in this thesis can therefore effectively solve the problem of printed Mongolian font classification.

For the second problem, this thesis takes the low-resolution printed Mongolian word image as its object and uses the Deeply-Recursive Convolutional Network (DRCN) model to perform super-resolution: each low-resolution Mongolian word image is reconstructed into a corresponding high-resolution Mongolian word image. On the relevant dataset, the DRCN model is compared with traditional interpolation algorithms (nearest-neighbor, bilinear and bicubic interpolation). The experimental results show that the DRCN model can effectively reconstruct high-resolution Mongolian word images, thus solving the super-resolution problem for word images.
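
The traditional interpolation baselines named above can be illustrated with a minimal pure-Python sketch. It is not the thesis's code: the function names (`upscale_nearest`, `upscale_bilinear`, `psnr`), the representation of a grayscale image as a list of rows, and the use of PSNR as a comparison metric are all assumptions made here for illustration, since the abstract does not state which metric or implementation was used. Bicubic interpolation and DRCN itself are omitted for brevity.

```python
import math


def upscale_nearest(img, factor):
    """Nearest-neighbor upscaling: each output pixel copies the source
    pixel whose block it falls into."""
    h, w = len(img), len(img[0])
    return [[img[y // factor][x // factor] for x in range(w * factor)]
            for y in range(h * factor)]


def upscale_bilinear(img, factor):
    """Bilinear upscaling: each output pixel is a weighted average of the
    four nearest source pixels (coordinates clamped at the borders)."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(h * factor):
        # Map the output pixel centre back into source coordinates.
        sy = min(max((y + 0.5) / factor - 0.5, 0.0), h - 1)
        y0 = int(sy)
        y1 = min(y0 + 1, h - 1)
        fy = sy - y0
        row = []
        for x in range(w * factor):
            sx = min(max((x + 0.5) / factor - 0.5, 0.0), w - 1)
            x0 = int(sx)
            x1 = min(x0 + 1, w - 1)
            fx = sx - x0
            top = img[y0][x0] * (1 - fx) + img[y0][x1] * fx
            bottom = img[y1][x0] * (1 - fx) + img[y1][x1] * fx
            row.append(top * (1 - fy) + bottom * fy)
        out.append(row)
    return out


def psnr(reference, test, peak=255.0):
    """Peak signal-to-noise ratio in dB between two same-sized images.
    Assumed metric: the abstract does not say how results were scored."""
    n = len(reference) * len(reference[0])
    mse = sum((a - b) ** 2
              for ref_row, test_row in zip(reference, test)
              for a, b in zip(ref_row, test_row)) / n
    return math.inf if mse == 0 else 10 * math.log10(peak ** 2 / mse)
```

In a comparison of this kind, a high-resolution word image would be downsampled, upscaled again by each method, and scored against the original; a learned model such as DRCN would be scored the same way against these baselines.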
Keywords/Search Tags:Machine-printed Mongolian, Font Classification, Convolutional Neural Network, Super-Resolution, Recurrent Neural Network