Recognition Of Handwritten Chemical Equations Based On End-to-end Trainable Neural Networks

Posted on: 2021-05-27
Degree: Master
Type: Thesis
Country: China
Candidate: M Y Kong
GTID: 2428330605958659
Subject: Software engineering
Abstract/Summary:
With the rapid development of the Internet and artificial intelligence, education informatization has begun to change traditional teaching methods, and online answering and other human-computer interaction scenarios are becoming increasingly common. Handwriting recognition has consequently become an active research direction in computer vision. Recognizing handwritten characters is easy for humans but difficult for computers. In recent years, deep convolutional neural networks have brought revolutionary changes to computer vision, and the combination of convolutional and recurrent neural networks has achieved great success on image-based sequence recognition problems, advancing the field of handwriting recognition.

Current research on handwriting recognition focuses mainly on English characters, digits, and Chinese characters, and has achieved good results in these areas. However, these recognition tasks are limited to one-dimensional layouts. Because chemical equations are long and have a complex two-dimensional spatial structure, recognizing them in handwritten form remains difficult. Solving this problem would advance handwritten chemical equation recognition and would also support online applications, such as quickly correcting assignments as a teaching aid and quickly entering chemical equations into a computer.

This thesis makes the following contributions to offline handwritten chemical equation recognition:

(1) Data collection with an electronic pen. Because no public data set of handwritten chemical equations exists, we manually collected a new data set containing 6586 samples of handwritten chemical equations.

(2) An end-to-end neural network training method for offline handwritten chemical equation recognition. We use the CNN+RNN+CTC model, one of the more recent approaches to image-based sequence labeling. The CNN+RNN part yields a better representation of the image, and CTC is a loss function that requires no alignment between input and labels, which removes the tedious work of annotating the position of each character. Experiments show that this model also performs well on handwritten chemical equations and learns the spatial information they contain.

(3) An optimized decoder for the CNN+RNN+CTC model. We select prefix beam search as the CTC decoding method and introduce two dictionaries into the decoding process, containing 447 and 1990 distinct prefixes respectively. The dictionaries are loaded into memory at initialization, so recognition is fast and uses little memory. Introducing the dictionaries compensates to some extent for the conditional independence assumption of CTC. Experiments show that this method is effective and further improves recognition accuracy.

(4) An experimental comparison of the two methods. We describe seven representative experiments in detail and compare the two methods under the same network configuration and experimental conditions. The model without the dictionaries reached 85.43% formula-level accuracy and 92.30% character-level accuracy; the model with the dictionaries reached 87.67% formula-level accuracy and 94.53% character-level accuracy.
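The dictionary-constrained decoding described in (3) can be illustrated with a minimal sketch. This is not the thesis implementation: it assumes the dictionary acts as a set of allowed prefixes that prunes candidates during CTC prefix beam search, and the names `build_prefix_set`, `ctc_prefix_beam_search`, the toy symbol set, and the per-frame probabilities are all hypothetical.

```python
from collections import defaultdict


def build_prefix_set(lexicon):
    """Return every non-empty prefix of every lexicon entry as a tuple of symbols."""
    return {tuple(word[:i]) for word in lexicon for i in range(1, len(word) + 1)}


def ctc_prefix_beam_search(frames, symbols, beam_width=4, allowed_prefixes=None):
    """Decode per-frame CTC probabilities into a string.

    frames[t][0] is the blank probability; frames[t][k] belongs to symbols[k - 1].
    allowed_prefixes (assumed dictionary format) is a set of symbol tuples;
    any extension that is not in the set is discarded.
    """
    # Each beam maps prefix -> (prob. of paths ending in blank, ending in non-blank).
    beams = {(): (1.0, 0.0)}
    for frame in frames:
        nxt = defaultdict(lambda: [0.0, 0.0])
        for prefix, (p_b, p_nb) in beams.items():
            # Emitting blank leaves the prefix unchanged.
            nxt[prefix][0] += (p_b + p_nb) * frame[0]
            for k, sym in enumerate(symbols, start=1):
                p = frame[k]
                if prefix and prefix[-1] == sym:
                    # Repeating the last symbol without a blank collapses into it.
                    nxt[prefix][1] += p_nb * p
                    # A genuinely new copy of the symbol must follow a blank.
                    mass = p_b * p
                else:
                    mass = (p_b + p_nb) * p
                new_prefix = prefix + (sym,)
                if allowed_prefixes is not None and new_prefix not in allowed_prefixes:
                    continue  # the dictionary rejects this extension
                nxt[new_prefix][1] += mass
        # Keep only the most probable prefixes.
        ranked = sorted(nxt.items(), key=lambda kv: kv[1][0] + kv[1][1], reverse=True)
        beams = {prefix: (b, nb) for prefix, (b, nb) in ranked[:beam_width]}
    best = max(beams, key=lambda prefix: sum(beams[prefix]))
    return "".join(best)


# Toy example: the second frame slightly prefers "Z" over the look-alike "2".
symbols = ["H", "2", "Z", "O"]
frames = [
    [0.10, 0.80, 0.03, 0.04, 0.03],
    [0.10, 0.05, 0.35, 0.45, 0.05],
    [0.10, 0.03, 0.03, 0.04, 0.80],
]

print(ctc_prefix_beam_search(frames, symbols))  # HZO
print(ctc_prefix_beam_search(frames, symbols,
                             allowed_prefixes=build_prefix_set(["H2O"])))  # H2O
```

In this toy case the unconstrained decoder follows the locally most likely symbol at each frame, while the prefix set steers the search toward a sequence the dictionary permits. That is one way a dictionary can offset the per-frame independence of CTC outputs.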
Keywords/Search Tags:handwritten chemical recognition, convolutional neural network, recurrent neural network, CTC, dictionary