| In recent years, with the rapid growth of document digitization through scanning and photographing, the demand for document recognition technology has been increasing. Automatically identifying and mining useful information from various kinds of documents is of great significance. As an efficient way of organizing and presenting data, tables are widely used to store and display data in documents such as books, financial statements, and magazines, and they are among the most important data objects on document pages. In the medical field in particular, the data in laboratory test reports and physical examination reports are presented in tables. It is therefore of great practical significance to propose an effective method and design a system for automatically identifying the tables in medical reports and extracting their data. The main contents of this thesis are as follows. First, this research creates a data set for table recognition and adds several types of annotations to it. The document images in the data set include real-scene document images obtained by scanning and by photographing with mobile devices, as well as screenshots of Word and PDF documents. Most of the images come from physical examination reports and medical laboratory documents. In addition, some images from the ICDAR2019 data set are transformed according to the characteristics of real-scene document images and then added to this data set. The data set provides a variety of annotations, including table position, cell position, and text information. The annotations are provided in two formats, HTML and XML, which makes it convenient to evaluate a model from multiple angles with different methods during training. Second, this research studies the table region detection task in document images and compares table region detection algorithms based on the YOLOv5 and Faster R-CNN deep learning models in terms of F1 score, weight file size, prediction time, and other measures. On this basis, a table region detection model is selected for the subsequent table recognition system. Third, this research proposes an algorithm that combines table line information and table text block information to identify the table structure in document images, and which can handle various types of tables. The table structure recognition algorithm is based on a semantic segmentation model: it segments the pixels of the table lines, integrates them into table line information, and combines this with the text position information obtained in the table detection task. The method detects and recognizes the input image and reconstructs the complete table, which provides important algorithmic support for the system design. Finally, based on the above research results, this thesis designs and implements a table recognition system based on a micro-service architecture. The front end of the system is built with the Vue framework, and the back end runs on the Flask and Spring Boot frameworks. The system adopts a modular functional design and consists of front-end pages, back-end programs, and a database. It supports document image uploading, table recognition, adjusting and downloading recognition results, extracting table data, analyzing table data, and other functions. |
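
The abstract compares detection models by F1 score without stating how a predicted table region is counted as correct. The following is a minimal sketch of one common convention, not necessarily the one used in the thesis: a prediction counts as a true positive when its intersection over union (IoU) with a not-yet-matched ground-truth box reaches a threshold. The function names `iou` and `detection_f1`, the box layout `(x_min, y_min, x_max, y_max)`, the greedy matching, and the default threshold of 0.5 are all assumptions made for illustration.

```python
from typing import List, Tuple

# Assumed box layout: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def detection_f1(predicted: List[Box],
                 ground_truth: List[Box],
                 iou_threshold: float = 0.5) -> float:
    """F1 score for table region detection on one image.

    Each prediction is greedily matched to the unmatched ground-truth
    box with the highest IoU; the match counts as a true positive only
    if that IoU reaches iou_threshold (an assumed value).
    """
    unmatched = list(ground_truth)
    true_pos = 0
    for pred in predicted:
        best = max(unmatched, key=lambda gt: iou(pred, gt), default=None)
        if best is not None and iou(pred, best) >= iou_threshold:
            unmatched.remove(best)
            true_pos += 1
    false_pos = len(predicted) - true_pos
    false_neg = len(ground_truth) - true_pos
    denom = 2 * true_pos + false_pos + false_neg
    return 2 * true_pos / denom if denom else 0.0


if __name__ == "__main__":
    preds = [(0, 0, 10, 10), (20, 20, 30, 30)]
    truth = [(0, 0, 10, 10), (50, 50, 60, 60)]
    # One correct detection, one false positive, one missed table.
    print(detection_f1(preds, truth))
```

With a metric of this kind, the outputs of both detectors can be scored on the same annotated images, so that F1 can be weighed against weight file size and prediction time when choosing a model.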