
Multi-font Tibetan Print Recognition Based On Neural Network

Posted on: 2022-02-28 | Degree: Master | Type: Thesis
Country: China | Candidate: Z J San | Full Text: PDF
GTID: 2518306482473374 | Subject: Computer application technology
Abstract/Summary:
Multi-font Tibetan printed character recognition is one of the key technologies of Tibetan optical character recognition and an important part of research on Tibetan printed character recognition. Current research on the task relies mainly on statistical methods. Under conditions of few samples, limited computing performance, and poor model generalization and portability, statistical methods could meet the needs of their time, and they have provided good ideas and results for the field in theory, models, and methods. However, research on multi-font Tibetan printed character recognition as a whole has the following problems: first, the training data resources are small in scale and low in accuracy, so they cannot meet the requirements of deep learning for data scale and accuracy; second, there is currently no literature that uses deep learning theories and methods to address multi-font Tibetan recognition; third, the Tibetan fonts studied are limited to a single font or only a few. In response to these three problems, this thesis mainly completes the following work:

(1) 544 modern Tibetan characters are extracted from a Tibetan corpus of 43 million bytes, and then 90 Tibetan fonts in Wujin (Uchen) style and 12 Tibetan fonts in Wumei (Ume) style are selected.

(2) First, two datasets are constructed: the Tibetan Printed Character Dataset (TPCD), with 48,960 samples, and the Tibetan Printed Character Dataset-CT (TPCD-CT), with 6,528 samples. The data in both datasets are labeled accurately. Second, the automatic recognition of multi-font Tibetan characters is studied using a deep learning method based on neural networks. Then, TPCD is used to build, test, and evaluate the model. Finally, comparative experiments against various baseline models show that the neural network can significantly improve the recognition rate, recall rate, and F1 value of the multi-font Tibetan print recognition task on the test set.

(3) Research on multi-font Tibetan print recognition is carried out using current mainstream deep learning methods. Results are obtained by applying the resnet50 and vgg16 models to the multi-font Tibetan typeface recognition task.

(4) TPCD and TPCD-CT, composed of 544 Tibetan characters (Ding) in 90 Wujin fonts and 12 Wumei fonts, are released with this thesis. They cover essentially all the Tibetan print fonts used in computing at home and abroad.
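
The dataset sizes quoted in the abstract can be checked with a short calculation. The sketch below is illustrative only: it assumes one image per (character, font) pair, and it assumes that the larger dataset corresponds to the 90 Wujin fonts and the smaller one to the 12 Wumei fonts. The abstract does not state either assumption explicitly, and the function and variable names are hypothetical.

```python
# Figures quoted in the abstract.
NUM_CHARACTERS = 544    # modern Tibetan characters
NUM_WUJIN_FONTS = 90    # Wujin-style fonts
NUM_WUMEI_FONTS = 12    # Wumei-style fonts


def dataset_size(num_characters, num_fonts, images_per_pair=1):
    """Number of samples if every character is rendered in every font.

    images_per_pair=1 is an assumption, not a detail from the abstract.
    """
    return num_characters * num_fonts * images_per_pair


# Assumed mapping: larger dataset <-> Wujin fonts, smaller <-> Wumei fonts.
tpcd_size = dataset_size(NUM_CHARACTERS, NUM_WUJIN_FONTS)
tpcd_ct_size = dataset_size(NUM_CHARACTERS, NUM_WUMEI_FONTS)

print(tpcd_size, tpcd_ct_size)
```

Under these assumptions the two products match the 48960 and 6528 samples reported for the two datasets, which is consistent with each dataset holding a single rendering of every character in every font of one style.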
Keywords/Search Tags:Tibetan, printed character dataset, neural network, resnet50, vgg16