Research On Layout Identification Of Formulas Based On Features Of Strokes And Structures

Posted on: 2012-06-30
Degree: Master
Type: Thesis
Country: China
Candidate: Y Zhao
Full Text: PDF
GTID: 2298330338995368
Subject: Computer application technology
Abstract/Summary:
Mathematical formulas can be produced with different formula editors. Each editor has its own default, uniform typesetting mode, so the layouts of formulas differ from one editor to another. Identifying which layout a formula uses can make a formula recognition system more adaptable and its reconstruction results more accurate. An algorithm for layout identification of mathematical formulas is designed in this paper based on these differences in typesetting. Because Word (MathType) and LaTeX are widely used, the analysis concentrates on the differences between these two formula formats. The algorithm is derived by analyzing the typefaces of variables, operators, and digits, and the relative positions of characters within structures. Identification consists mainly of font discrimination based on stroke features and structure discrimination based on structural features between symbols. For font discrimination, thinning and the Hough transform are used to extract italic variables, and character projection and interception are used to find the most distinctive feature points, which are then used to identify the font layout. For structure discrimination, simple structures are analyzed and the relative positions of centroids are calculated to identify the layout. Finally, the layout of the mathematical formula is identified. Experiments indicate that the algorithm achieves relatively high accuracy.
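The two discrimination steps named in the abstract can be illustrated with a small sketch. It is not the thesis's implementation: it assumes thinning has already been done and that each symbol arrives as a non-empty list of `(x, y)` skeleton pixel coordinates, and it replaces the full Hough transform with a simplified Hough-style vote over (slant angle, intercept) pairs. The function names `estimate_slant_deg` and `relative_centroid_offset`, the 1-degree angle step, and the height normalisation are all hypothetical choices made for illustration.

```python
import math
from collections import Counter


def estimate_slant_deg(skeleton_pixels, max_slant=45):
    """Estimate the dominant stroke slant (degrees from vertical).

    Simplified Hough-style vote (an assumption, not the thesis's method):
    a stroke leaning by `angle` satisfies x = c + y * tan(angle), so each
    pixel votes for the intercept c it implies at every candidate angle.
    The angle whose busiest intercept bin collects the most votes wins;
    ties go to the first (smallest) candidate angle.
    """
    best_angle, best_votes = 0, -1
    for angle in range(-max_slant, max_slant + 1):
        t = math.tan(math.radians(angle))
        votes = Counter(round(x - y * t) for x, y in skeleton_pixels)
        top = max(votes.values())
        if top > best_votes:
            best_angle, best_votes = angle, top
    return best_angle


def centroid(pixels):
    """Mean (x, y) position of a non-empty pixel list."""
    n = len(pixels)
    return (sum(x for x, _ in pixels) / n, sum(y for _, y in pixels) / n)


def relative_centroid_offset(base_pixels, attached_pixels):
    """Vertical centroid offset of an attached symbol (e.g. a superscript)
    relative to its base symbol, divided by the base symbol's height."""
    _, base_y = centroid(base_pixels)
    _, att_y = centroid(attached_pixels)
    ys = [y for _, y in base_pixels]
    height = max(ys) - min(ys) + 1
    return (att_y - base_y) / height


if __name__ == "__main__":
    # Synthetic stroke leaning about 15 degrees; x is rounded to the pixel grid.
    stroke = [(round(5 + y * math.tan(math.radians(15))), y) for y in range(20)]
    print("estimated slant:", estimate_slant_deg(stroke))
```

In such a sketch, a slant estimate near zero would suggest an upright glyph and a clearly non-zero one an italic glyph, while the normalised centroid offset gives a scale-independent number that could be compared between the two typesetting styles. The thresholds that separate MathType from LaTeX output are not given in the abstract and are therefore left out.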
Keywords/Search Tags: Mathematical formula, Layout identification, MathType, LaTeX, Font, Structure