Table Segmentation Of Complex Bill Based On Deep Learning

Posted on: 2021-03-01
Degree: Master
Type: Thesis
Country: China
Candidate: M W Yang
Full Text: PDF
GTID: 2428330614971351
Subject: Electronic and communication engineering
Abstract/Summary:
With economic development, bills are used more and more in daily life. During bill-information review and financial reimbursement, financial personnel must manually enter the required information into a computer system, which is a heavy workload. Advances in layout analysis and text recognition have made automatic bill recognition and entry possible. Bill recognition and automatic entry involve bill layout segmentation, text detection, text recognition, and related steps. Because the practical requirement of bill recognition is structured output, bill layout segmentation plays a very important role. This thesis proposes a deep-learning-based bill table segmentation algorithm for producing structured output from complex bills laid out as tables. The main work is as follows:

1. Deep learning is used to extract a heat map of candidate table cross points in the bill. Based on the characteristics of bills, the network structures of Richer Convolutional Features (RCF) and Point-Pair Graph Network (PPGNet) are analyzed. The RCF network uses VGG16 as its backbone and is modified for the task of detecting candidate cross points in bills. PPGNet uses an FPN with pyramid pooling as its backbone; the part of the network that detects line end points is used to determine the candidate cross points in the bill.

2. Noise suppression is applied to the extracted candidate heat map, and hierarchical clustering is used to convert the heat map into coordinates. On a synthetic table data set and an actual bill data set, these coordinates are compared with the ground truth to calculate the recall rate and accuracy rate. The final junction detection results show that PPGNet performs better than the RCF-series networks.

3. The Coherent Point Drift (CPD) algorithm is used to match the cross points and segment the bill table. First, the junction set extracted from the bill to be segmented is registered with the junction set of the bill template, and the correspondence between the two point sets is obtained from the probability matrix. Then, using the cell composition information associated with the template junctions, the vertices that make up each cell of the bill to be segmented are determined, which completes the segmentation of the bill table.

The algorithm is tested on the actual bill data set, and the experimental results show the effectiveness of the proposed algorithm.
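
The heat-map-to-coordinates step and its evaluation can be illustrated with a minimal Python sketch. It is not the thesis implementation: the function names, the thresholds `score_thresh`, `merge_dist` and `tol`, the choice of single-linkage clustering, the score-weighted centroid, and the reading of "accuracy rate" as precision are all assumptions made for illustration.

```python
from math import hypot


def heatmap_to_junctions(heatmap, score_thresh=0.5, merge_dist=3.0):
    """Convert a 2-D heat map (list of rows of floats) into (x, y) junctions.

    Hypothetical helper: thresholds and linkage are illustrative assumptions.
    """
    # Noise suppression: keep only pixels with a strong enough response.
    pts = [(x, y, v)
           for y, row in enumerate(heatmap)
           for x, v in enumerate(row)
           if v >= score_thresh]

    # Union-find over the surviving pixels.
    parent = list(range(len(pts)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    # Single-linkage hierarchical clustering cut at merge_dist is equivalent
    # to joining every pair of points that lie within merge_dist of each other.
    for i in range(len(pts)):
        for j in range(i + 1, len(pts)):
            if hypot(pts[i][0] - pts[j][0], pts[i][1] - pts[j][1]) <= merge_dist:
                parent[find(i)] = find(j)

    clusters = {}
    for i, p in enumerate(pts):
        clusters.setdefault(find(i), []).append(p)

    # Represent each cluster by its score-weighted centroid.
    junctions = []
    for members in clusters.values():
        w = sum(v for _, _, v in members)
        cx = sum(x * v for x, _, v in members) / w
        cy = sum(y * v for _, y, v in members) / w
        junctions.append((cx, cy))
    return sorted(junctions)


def recall_precision(pred, truth, tol=2.0):
    """Greedy one-to-one matching of predictions to ground truth within tol pixels."""
    unmatched = list(truth)
    hits = 0
    for px, py in pred:
        best = None
        for t in unmatched:
            d = hypot(px - t[0], py - t[1])
            if d <= tol and (best is None or d < best[0]):
                best = (d, t)
        if best is not None:
            unmatched.remove(best[1])  # each ground-truth point is used once
            hits += 1
    recall = hits / len(truth) if truth else 0.0
    precision = hits / len(pred) if pred else 0.0
    return recall, precision
```

Calling `heatmap_to_junctions` on a network output and passing the result to `recall_precision` together with the annotated junctions gives the two scores. The pairwise loop is quadratic in the number of surviving pixels, so a real pipeline would likely thin the heat map first or use a library clustering routine.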
Keywords/Search Tags: Deep Learning, Junction Detection, Hierarchical Clustering, Point Set Matching