
Handwriting Oracle Bone Character Recognition

Posted on: 2022-08-07
Degree: Master
Type: Thesis
Country: China
Candidate: Q W Dai
Full Text: PDF
GTID: 2505306326498724
Subject: Master of Engineering
Abstract/Summary:
Oracle bone inscriptions (OBI) are the gene of Chinese culture and the root of Chinese traditional culture. As oracle bone research has deepened, it has entered the information age. At present, OBI information processing faces two problems. The first is the input of handwritten OBI: the recognition rate of handwritten OBI is currently too low for practical use. The second is the digitization of OBI documents: most documents are stored and displayed on computers as screenshots or photocopies, which makes them inconvenient to retrieve and use. The core of both problems is the recognition of handwritten OBI. At present, the recognition rate for standard OBI is only 83%, and handwritten OBI is more difficult still. Deep learning has made great progress in handwritten character recognition, which inspired us to apply these techniques to handwritten oracle bone characters. This thesis studies character recognition methods based on deep convolutional neural networks and evaluates them on a handwritten OBI dataset, improving the practicality of an OBI handwriting input method and exploring the digitization of OBI documents. The main contributions are as follows:

(1) A handwritten OBI dataset is constructed for deep learning training and testing. Handwritten OBI differ from handwritten Chinese characters: they have high intra-class similarity and complex fonts. The characters were collected with a data collector according to the character library of the "Yin Qi Wen Yuan-Oracle Big Data Platform" and encoded using Unicode sixth-plane coding. In total, 83,245 sample images were collected and divided by font code into 3,881 classes of OBI images.

(2) Building on this dataset and on recent deep learning models for handwritten Chinese character recognition, handwritten OBI recognition based on convolutional neural networks is studied. Before training, the dataset is preprocessed and all OBI images are resized to 96×96×3. Classic convolutional neural network recognition models and handwritten Chinese character recognition models are adapted to improve accuracy on handwritten OBI, and four designs are used as the network's fully connected stage: Full Connection, Global Average Pooling, Global Weighted Average Pooling, and Global Weighted Output Average Pooling. This reduces model size and improves recognition speed while preserving recognition accuracy. The final model reaches a recognition accuracy of 97.67%.

(3) An OBI handwriting input method is designed. PyQt5 is used to build the client interface, in which the user writes an OBI character; the client recognizes the handwritten character by interacting with the recognizer. It also provides functions for copying the OBI and saving the handwritten OBI image. The input method has been applied in practical engineering.

(4) A digitization system for OBI documents is designed and developed. To account for the particular characteristics of oracle bone literature, modules for layout analysis, text segmentation, text recognition, and others were developed and have been applied in practical engineering.
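As a rough illustration of why the pooling-based heads in contribution (2) can shrink the model, the Python sketch below compares the parameter count of a classifier head built on a flattened feature map with heads built on Global Average Pooling (GAP) and Global Weighted Average Pooling (GWAP). The feature-map shape (6×6×256), the function names, and the GWAP formulation (one learnable weight per spatial position and channel) are assumptions for illustration only; the abstract does not give the thesis's actual architecture or definitions. Global Weighted Output Average Pooling is not shown.

```python
import numpy as np

NUM_CLASSES = 3881       # number of OBI classes reported in the abstract
H, W, C = 6, 6, 256      # ASSUMED shape of the last convolutional feature map


def global_average_pooling(fmap: np.ndarray) -> np.ndarray:
    """Average each channel over all spatial positions: (H, W, C) -> (C,)."""
    return fmap.mean(axis=(0, 1))


def global_weighted_average_pooling(fmap: np.ndarray, w: np.ndarray) -> np.ndarray:
    """Weighted spatial sum per channel (assumed formulation).

    `w` has the same shape as `fmap` and would be learned during training.
    """
    return (fmap * w).sum(axis=(0, 1))


def head_parameter_counts(h: int, w: int, c: int, n_classes: int) -> dict:
    """Trainable parameters (weights plus biases) of each classifier head."""
    return {
        # Flatten the feature map, then one dense layer to the classes.
        "flatten_fc": h * w * c * n_classes + n_classes,
        # GAP has no parameters; only the dense layer on C features remains.
        "gap_fc": c * n_classes + n_classes,
        # GWAP adds one weight per position and channel before the dense layer.
        "gwap_fc": h * w * c + c * n_classes + n_classes,
    }


rng = np.random.default_rng(0)
fmap = rng.standard_normal((H, W, C))

# Sanity check: with uniform weights 1/(H*W), GWAP reduces to plain GAP.
uniform = np.full((H, W, C), 1.0 / (H * W))
assert np.allclose(
    global_average_pooling(fmap),
    global_weighted_average_pooling(fmap, uniform),
)

counts = head_parameter_counts(H, W, C, NUM_CLASSES)
for name, n in counts.items():
    print(f"{name}: {n:,} parameters")
```

Under these assumed dimensions, the flattened head needs roughly 36 times as many parameters as the GAP head, which is consistent with the abstract's claim that model size drops. The actual savings depend on the real feature-map shape.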
Keywords/Search Tags: handwritten OBI recognition, deep convolutional neural network, handwritten OBI dataset, OBI handwriting input method, digitization of OBI document images