
Research And Implementation On Table Detection And Table Structure Recognition Method Based On Deep Learning

Posted on: 2024-04-05
Degree: Master
Type: Thesis
Country: China
Candidate: H R Niu
Full Text: PDF
GTID: 2568306923956089
Subject: Software engineering
Abstract/Summary:
In the current era of big data, the number of documents has grown explosively, and documents make up an important share of all stored data. Tables appear particularly often in documents, especially in scientific literature, financial documents, and government documents, and they increasingly dominate document pages. However, because tables are semi-structured and often have complex layouts, the information they contain is difficult to use directly. To automate table processing and to extract and use the data inside tables, table detection and table structure recognition have attracted growing research attention in recent years.

Although significant progress has been made on both tasks, they still face many challenges because of the diversity of tables. First, current methods mainly focus on bordered tables. For sparse and borderless tables, the table region is difficult to determine from ruling lines, which can easily lead to the loss of table information or even to the table not being detected at all. In addition, unlike handwritten forms, tables in documents are often typeset with few ruling lines or none at all, such as three-line tables, for reasons of appearance and efficiency. These tables usually carry rich semantic information and distinctive spatial information within the document layout. Existing methods often use only the structural information of tables and neglect spatial information: they do not model the spatial layout of the document image or fuse it with table structural information, which would help models make better predictions.

To address these problems, this thesis proposes two models: a table detection method based on object detection and a table structure recognition method based on the Transformer architecture. The main work is summarized as follows:

(1) A table detection method based on object detection is proposed. It consists of three modules: feature extraction, feature enhancement, and feature fusion. In the feature extraction module, a channel attention mechanism is introduced to learn the importance of each channel of the feature map and to give higher weight to the channels that play a key role in detection. In the feature enhancement module, a method for enhancing the extraction of table location features is proposed, which weights the spatial positions of the table so that its spatial characteristics are fully extracted. In the feature fusion module, a feature pyramid fusion structure is proposed that applies convolution to the fused features, thereby improving the model's table detection performance.

(2) A table structure recognition method based on the Transformer architecture is proposed. The model consists of an encoding part and a decoding part. In the encoding part, a plug-and-play feature enhancement module based on an attention mechanism is proposed, which can be inserted directly into both the CNN backbone and the Transformer encoder. The decoding part uses a decoding strategy based on a memory cache to speed up decoding.

(3) Comparative experiments on table detection and table structure recognition were carried out on the general table detection dataset and on the PubTabNet (partial) and TAL_OCR_TABLE datasets. The results show that the proposed models achieve relatively advanced results, and a large number of ablation experiments verify the effectiveness of the proposed modules.

(4) A system based on Vue and Flask is designed for table detection and structure recognition. Given an input image, the system returns its table detection or structure recognition result for that image. If the output contains errors, they can be fixed in the correction module.
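
The channel attention idea in item (1) can be illustrated with a minimal sketch. The abstract does not say which formulation the thesis uses, so this example assumes a squeeze-and-excitation style design: each channel is averaged to a single value, a small two-layer network turns those values into per-channel weights between 0 and 1, and each channel is multiplied by its weight. The function names, the toy feature map, and the fixed weights `w1`, `b1`, `w2`, `b2` are all hypothetical; in a real model the weights would be learned during training.

```python
import math


def global_avg_pool(feature_map):
    """Squeeze step: reduce each channel (a 2-D grid) to its mean value."""
    return [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]


def dense(vector, weights, bias):
    """Fully connected layer; `weights` holds one row per output unit."""
    return [
        sum(w * x for w, x in zip(row, vector)) + b
        for row, b in zip(weights, bias)
    ]


def channel_attention(feature_map, w1, b1, w2, b2):
    """Return (reweighted feature map, per-channel weights in (0, 1)).

    Assumed squeeze-and-excitation style formulation, not necessarily
    the one used in the thesis.
    """
    squeezed = global_avg_pool(feature_map)
    # Bottleneck layer with ReLU activation.
    hidden = [max(0.0, h) for h in dense(squeezed, w1, b1)]
    # Output layer with sigmoid, giving one weight per channel.
    scores = [1.0 / (1.0 + math.exp(-s)) for s in dense(hidden, w2, b2)]
    # Scale every value of a channel by that channel's weight.
    scaled = [
        [[value * score for value in row] for row in channel]
        for channel, score in zip(feature_map, scores)
    ]
    return scaled, scores


# Toy feature map: 2 channels, each 2x2 (hypothetical values).
feature_map = [
    [[1.0, 1.0], [1.0, 1.0]],
    [[4.0, 4.0], [4.0, 4.0]],
]

# Hypothetical fixed weights: 2 channels -> 1 hidden unit -> 2 channels.
w1, b1 = [[0.5, 0.5]], [0.0]
w2, b2 = [[-1.0], [1.0]], [0.0, 0.0]

scaled, scores = channel_attention(feature_map, w1, b1, w2, b2)
print("channel weights:", scores)
```

With these illustrative weights the second channel receives the larger weight, so its values are suppressed less than those of the first channel. This is the sense in which channels that matter more for detection are given higher weight.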
Keywords/Search Tags:Table detection, Table structure recognition, Spatial information, Object detection, Attention mechanism