Research On Unbalance And Anti-attack For Chinese Character Recognition In Natural Scenes

Posted on: 2021-04-02
Degree: Doctor
Type: Dissertation
Country: China
Candidate: X J Feng
Full Text: PDF
GTID: 1368330614950635
Subject: Computer application technology
Abstract/Summary:
Optical Character Recognition (OCR), one of the fundamental computer vision tasks, still needs improvement in recognizing Chinese characters in natural scenes. Chinese OCR often suffers from unbalanced and scarce training data. The rapid development of deep learning in recent years has greatly advanced OCR in natural scenes, especially for languages with a small set of simple characters (such as English, with only 26 letters), where performance has even surpassed human recognition. However, for languages with a large number of complex characters (e.g., Chinese), sample imbalance and other factors leave deep-learning-based methods far from satisfactory. This work studies the challenges that existing OCR algorithms face in recognizing Chinese characters. The main contributions are fourfold.

First, a Chinese recognition method based on Focal CTC Loss for unbalanced datasets is proposed. When trained on a Chinese recognition task, a deep learning model inevitably faces data imbalance, which leads to insufficient training on low-frequency characters and degrades the overall performance of the model. We propose a focal Connectionist Temporal Classification (CTC) loss that makes the model automatically lean towards under-learned samples during training, without human intervention. The influence of data imbalance is thereby alleviated, and performance is significantly improved.

Second, a Refined-Dense Net for Chinese recognition is proposed. Because Chinese recognition is difficult, deep backbone networks are typically used for feature extraction, which increases not only the training difficulty but also the memory and time consumed at prediction. To address this problem, we propose an effective network refinement method that greatly reduces model size and prediction time while improving recognition accuracy.

Third, a handwritten Chinese recognition method based on meta-learning is proposed. Handwritten Chinese recognition is the most challenging subject in OCR. Because handwriting styles vary widely and data samples are insufficient, model training is extremely difficult. To alleviate this problem, we propose a meta-learning-based method that significantly improves accuracy under the insufficient-data condition.

Fourth, an anti-attack method for Chinese character recognition based on a boosting iteration method is proposed. In Chinese character recognition, the neural network model is vulnerable to adversarial samples, so the recognition model can seriously misjudge test samples that look simple to the human eye. We propose an anti-attack method for Chinese character recognition that effectively defends the recognition model against adversarial samples and improves the stability of the character recognition model.
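As an illustration of the first contribution, the sketch below shows how a focal-style weight can be applied to a per-sample CTC loss so that poorly learned samples dominate training. The abstract does not give the loss formula; this sketch assumes the usual focal form, alpha * (1 - p)^gamma * L, with p = exp(-L) taken as the probability the model assigns to the correct label sequence. The function name `focal_ctc_loss` and the default values of `alpha` and `gamma` are hypothetical, chosen only for the example.

```python
import math


def focal_ctc_loss(ctc_nll, alpha=0.25, gamma=2.0):
    """Re-weight one sample's CTC negative log-likelihood in focal-loss style.

    ctc_nll: plain CTC loss of the sample (a non-negative number).
    alpha, gamma: assumed hyperparameters, not values taken from the source.
    """
    if ctc_nll < 0:
        raise ValueError("CTC negative log-likelihood must be non-negative")
    # Probability of the correct label sequence implied by the CTC loss.
    p = math.exp(-ctc_nll)
    # Well-learned samples (p close to 1) are scaled down; hard ones are not.
    return alpha * (1.0 - p) ** gamma * ctc_nll


easy = focal_ctc_loss(0.05)  # sample the model already recognizes well
hard = focal_ctc_loss(3.0)   # under-learned sample, e.g. a rare character
print(f"easy={easy:.6f} hard={hard:.6f}")
```

With `gamma=0` and `alpha=1` the function returns the unmodified CTC loss, so the focal term can be read as a pure re-weighting of the standard objective.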
Keywords/Search Tags: Chinese OCR, Deep learning, Refined Network, Meta-learning, Anti-attack