Design Of Mongolian Standard Compliance Detection System Based On Deep Learning

Posted on: 2020-06-10
Degree: Master
Type: Thesis
Country: China
Candidate: C Y Zhou
Full Text: PDF
GTID: 2428330596492264
Subject: Computer technology
Abstract/Summary:
The Mongolian information standardization system is currently under construction, and only two standard conformity testing tools exist for the Mongolian language. The first extracts features from printed Mongolian white-body typefaces, computes a similarity score, and applies a fixed threshold to decide whether a typeface is consistent with the national standard typeface. The second relies mainly on manual identification to decide whether a given typeface matches the national standard typeface. Both systems have their own advantages and disadvantages, and both fill gaps in Mongolian standard compliance testing tools to varying degrees. However, because neither is efficient in practical use, it remains difficult to promote the implementation of the published Mongolian information technology standards. To address these problems, this thesis studies Mongolian standard conformity testing, implements a Mongolian standard conformity detection system based on deep learning, and contributes to the construction of Mongolian information standardization. The main research content is as follows:

(1) A dataset for traditional Mongolian encoding conformity detection based on national standards is constructed. The encoding sequences to be tested are first stored in a txt file. For each font in turn, a picture of the corresponding rendered area is captured manually and segmented into Mongolian words using OCR technology, and each resulting word image is rotated 90° counterclockwise and saved. After these steps, the character images that conform to the national standard are placed in the training set by manual labeling. Because few font files have been published, the resulting dataset is too small to satisfy the requirements of deep learning classification, so it is augmented. This thesis uses the ImageDataGenerator class of the deep learning framework Keras for sample expansion. The training set is built from twelve published Mongolian white-body fonts with data augmentation, and the validation set is randomly split from the training set at a ratio of 0.25; the two are used together to train the classification model. The test set uses the white-body fonts of three products with a relatively large market share.

(2) Because traditional Mongolian word images are difficult to segment into characters and some features are inconvenient to extract, this thesis chooses the convolutional neural network, which performs strongly in image classification, as the classification model. The baseline model is LeNet-5, and the comparison models are an improved LeNet-5 convolutional neural network and AlexNet. In the comparison experiments, the classification performance of the baseline and comparison models is observed while varying factors such as the input image size and the number of training iterations. The AlexNet model, which achieved the best classification results in the experiments, is selected as the model for Mongolian standard conformity testing. It reaches an accuracy of 98.72% on the Mongolian encoding character test set, 98.48% on the Mongolian conversion rule test set, and 100% on the Mongolian resource test set.

The Mongolian standard conformity detection system based on deep learning is implemented with PyQt5 on Windows. The experimental results are good, and the system can meet the needs of practical application.
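Two of the dataset preparation steps described above, the 90° counterclockwise rotation of each word image and the random 0.25 validation split, can be sketched in plain Python. This is only an illustration under stated assumptions, not the thesis's implementation: images are represented as nested lists of pixel values, the function names rotate_ccw_90 and split_validation are hypothetical, and the fixed seed is added here only for reproducibility. The thesis itself reports using the Keras ImageDataGenerator class for augmentation, which this sketch does not reproduce.

```python
import random


def rotate_ccw_90(image):
    """Rotate a row-major 2D image (a list of rows) 90 degrees counterclockwise.

    Hypothetical helper: transposing the image and then reversing the order
    of the resulting rows is equivalent to a counterclockwise quarter turn.
    """
    return [list(row) for row in zip(*image)][::-1]


def split_validation(samples, ratio=0.25, seed=0):
    """Randomly hold out `ratio` of the samples as a validation set.

    Hypothetical helper: the 0.25 ratio comes from the abstract; the seed
    is an assumption so that the split can be repeated.
    Returns (training_samples, validation_samples).
    """
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_val = int(len(shuffled) * ratio)
    return shuffled[n_val:], shuffled[:n_val]


# A 2x3 "image": the rightmost column becomes the top row after rotation.
rotated = rotate_ccw_90([[1, 2, 3], [4, 5, 6]])
# rotated == [[3, 6], [2, 5], [1, 4]]

# 100 placeholder sample ids split into 75 training and 25 validation items.
train, val = split_validation(range(100), ratio=0.25)
```

The split is drawn from the training data only, which matches the abstract's description of a test set built separately from the fonts of three other products.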
Keywords/Search Tags:Mongolian encoding, standard conformance test, convolutional neural network, complex text layout engine, the national standard