
Design And Implementation Of A Chinese Recognition System Based On Deep Learning

Posted on: 2020-11-04 | Degree: Master | Type: Thesis
Country: China | Candidate: S Ding | Full Text: PDF
GTID: 2428330599458956 | Subject: Electronics and Communications Engineering
Abstract/Summary:
In recent years, optical character recognition has advanced considerably thanks to the rapid development of deep learning. Commercial products are already available for text detection and recognition in documents, and researchers have achieved a great deal in character and text recognition in natural scenes. However, Chinese characters are harder to recognize than English words because they have a very large number of classes and complex spatial structures. Relatively little research targets Chinese character recognition, which leaves considerable room for development in this area. In this paper, I discuss the difficulties of Chinese character recognition, study and introduce the latest deep-learning-based algorithm for Chinese character recognition, and build the network it proposes with PyTorch. Beyond that, I produce a Chinese character data set and use it to train and test the network, then improve the network according to the experimental results and obtain better recognition performance, which is helpful to research on Chinese character recognition in documents and natural scenes. The main research contents of this paper are as follows:

First, this paper produces a data set of Chinese characters that contains all 27533 characters of the GB18030-2005 national standard. Each character has the same number of images, which keeps the classes balanced. I use this data set to train and test the Radical Analysis Network (RAN), a neural network that recognizes a Chinese character by predicting its Ideograph Description Sequence. RAN's zero-shot learning ability is verified and evaluated through this experiment.

Then, I explain concepts such as radicals, the spatial structure of Chinese characters, and the Ideograph Description Sequence. After that, I expand on the architecture and working mechanism of RAN and its advantages over general neural networks. To evaluate RAN's generalization ability, I build RAN with PyTorch myself, design experiments that train it on the data set I produce, and test its zero-shot performance. I also use the CTW Chinese character data set to verify whether RAN is robust enough in the wild. The experimental results show that RAN can predict Chinese characters with high accuracy in natural scenes.

Finally, this paper analyzes the deficiencies of RAN when it is evaluated on the CTW data set and improves RAN by adding a Spatial Transformer Network (STN) to it. STN is often used to improve a network's spatial invariance. I add it on top of RAN and retrain the new network; the experimental results show that RAN with STN is more robust to distorted input and achieves better recognition performance.

This paper introduces in detail the latest deep learning algorithm for recognizing Chinese characters, implements it, and designs experiments to evaluate its effectiveness and feasibility. This work is meaningful for subsequent research on Chinese character recognition in the real world.
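As an illustration of the Ideograph Description Sequence idea mentioned in the abstract, the following is a minimal sketch, not the thesis's RAN implementation. It parses a prefix-notation sequence into a tree and checks whether an unseen character could in principle be spelled from symbols seen in training, which is the intuition behind zero-shot recognition. The function names `parse_ids` and `is_composable` and the four-character table are hypothetical and chosen only for this example; the sketch covers only the twelve ideographic description characters U+2FF0 to U+2FFB.

```python
# Ideographic Description Characters (IDCs) U+2FF0..U+2FFB and their arity.
# U+2FF2 and U+2FF3 take three components; the other ten take two.
IDC_ARITY = {chr(cp): 2 for cp in range(0x2FF0, 0x2FFC)}
IDC_ARITY["\u2FF2"] = 3
IDC_ARITY["\u2FF3"] = 3


def parse_ids(ids):
    """Parse a prefix-notation IDS string into a nested tuple tree.

    A radical is returned as a plain string; a structure is returned as
    (idc, child, child, ...).
    """
    def parse(pos):
        if pos >= len(ids):
            raise ValueError("truncated IDS: " + ids)
        symbol = ids[pos]
        pos += 1
        arity = IDC_ARITY.get(symbol)
        if arity is None:
            return symbol, pos  # a radical (leaf)
        children = []
        for _ in range(arity):
            child, pos = parse(pos)
            children.append(child)
        return (symbol, *children), pos

    tree, end = parse(0)
    if end != len(ids):
        raise ValueError("trailing symbols in IDS: " + ids)
    return tree


def is_composable(ids, known_symbols):
    """Return True if every symbol of the IDS was seen during training."""
    return all(symbol in known_symbols for symbol in ids)


# Illustrative (hypothetical) training table: character -> IDS.
train_ids = {"好": "⿰女子", "林": "⿰木木", "杏": "⿱木口"}
known = {symbol for ids in train_ids.values() for symbol in ids}

# "李" never appears in the training table, but its IDS uses only known symbols.
unseen_ids = "⿱木子"
print(parse_ids(unseen_ids), is_composable(unseen_ids, known))
```

In this toy setting the unseen character is covered because its structure symbol and both radicals occur in other training characters; a sequence-predicting network can therefore, at least in principle, emit the right sequence for a class it never saw.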
Keywords/Search Tags:Chinese character recognition, radicals, Ideograph Description Sequence, Radical Analysis Network