Research And Implementation Of Self-supervised Learning In Handwritten Recognition Of Off-line Mathematical Formulas

Posted on: 2021-03-03
Degree: Master
Type: Thesis
Country: China
Candidate: S W Jiang
GTID: 2428330623468516
Subject: Master of Computer Technology
Abstract/Summary:
With the development of the information age, the Internet has given the traditional mode of education a new, contemporary meaning. In the field of intelligent education, an automatic marking system based on off-line handwritten mathematical formula recognition can free teachers from heavy manual labor. However, progress in this area has been slow, because mathematical formulas contain complex structures that are not one-dimensional, such as fractions and radicals. In this thesis, we build an off-line handwritten mathematical formula recognition system that can accurately extract and recognize mathematical formulas from complex backgrounds. The system is divided into the following three modules.

1. Formula line cutting module. Its main function is to locate and segment mathematical formulas from the complex background, based on existing object detection technology.
2. Structure analysis module. To address the complex structure of mathematical formulas, this thesis proposes a semantic segmentation method based on self-supervised learning. With this method, we locate the two-dimensional structures in handwritten mathematical formulas (fractions, radicals, vectors, and subscripts), as well as other content found in students' answers, such as Chinese characters in the solution process, mathematical symbols, and deletion strokes. This completes the grammatical analysis of the two-dimensional structure of the handwritten formula. Traditional image processing algorithms are then used to correct and improve the analysis results.
3. Sequence recognition module. An image text recognition network is built on current sequence recognition technology.

The system not only performs grammatical analysis of the two-dimensional structure of mathematical formulas, but also puts forward effective solutions to problems that current industry models for off-line handwritten mathematical formula recognition cannot handle: recognizing multi-line text, processing Chinese characters within formulas, and dealing with deleted handwriting in formulas.

Finally, on the photo data collected by our system from the answer sheets of middle school students in Chengdu, the average accuracy of formula cutting reached 84.7%, and the error rate was reduced to 7.2%. The semantic segmentation algorithm based on self-supervised learning achieves an average pixel accuracy of 93.37% and a mean intersection over union of 81.22%. The average character accuracy of sequence recognition is 98.58%.
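The abstract does not define the segmentation metrics it reports (pixel accuracy and intersection ratio, that is, intersection over union). The sketch below assumes the commonly used per-class definitions, averaged over the classes that occur; the thesis may compute them differently. The function names and the toy label maps are hypothetical and are not taken from the thesis or its data.

```python
def confusion_matrix(truth, pred, num_classes):
    """Count pixels: rows are true classes, columns are predicted classes."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for true_row, pred_row in zip(truth, pred):
        for t, p in zip(true_row, pred_row):
            matrix[t][p] += 1
    return matrix


def mean_pixel_accuracy(matrix):
    """Average over classes of (correctly labelled pixels / true pixels)."""
    scores = []
    for c, row in enumerate(matrix):
        total = sum(row)
        if total > 0:  # skip classes absent from the ground truth
            scores.append(row[c] / total)
    return sum(scores) / len(scores)


def mean_iou(matrix):
    """Average over classes of intersection / union of true and predicted pixels."""
    n = len(matrix)
    scores = []
    for c in range(n):
        tp = matrix[c][c]
        fn = sum(matrix[c]) - tp                        # true c, predicted other
        fp = sum(matrix[r][c] for r in range(n)) - tp   # predicted c, true other
        union = tp + fp + fn
        if union > 0:  # skip classes absent from both maps
            scores.append(tp / union)
    return sum(scores) / len(scores)


# Toy 2x3 label maps with two classes (illustrative only, not thesis data).
truth = [[0, 0, 1],
         [0, 1, 1]]
pred = [[0, 1, 1],
        [0, 1, 0]]

cm = confusion_matrix(truth, pred, num_classes=2)
mpa = mean_pixel_accuracy(cm)  # 2/3 for each class, so about 0.667
miou = mean_iou(cm)            # 2/4 for each class, so 0.5
```

In this toy case each class has two correct pixels, one missed pixel, and one false alarm, which shows why intersection over union is the stricter of the two numbers: it penalizes both kinds of error, while per-class pixel accuracy only penalizes misses.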
Keywords/Search Tags: Self-supervised