Research On Key Technologies Of Handwriting And Print Segmentation In Electronic Dossier

Posted on: 2019-08-15
Degree: Master
Type: Thesis
Country: China
Candidate: H Wang
Full Text: PDF
GTID: 2428330548463611
Subject: Software engineering
Abstract/Summary:
On July 28, 2016, the Supreme People's Court issued the "Guiding Opinions on Promoting the Synchronous Generation and In-depth Application of the Electronic Dossiers of People's Courts with the Case" nationwide. Since then, courts across the country have worked on generating electronic dossiers alongside each case and applying them in depth. The purpose is to convert the litigation documents collected and generated while cases are handled into electronic dossiers by scanning them, and to classify the electronic dossiers automatically into the corresponding directories by means of text recognition. Achieving this would be of great significance for improving court case management and the quality and efficiency of case handling.

However, most of the electronic dossiers actually generated contain both handwriting and print, while existing text recognition technology targets only one form of text. Separating the handwriting from the print in electronic dossiers, so that each can be passed to its own recognition process and recognition accuracy improved, is therefore both necessary and urgent. This thesis focuses on the key technologies for segmenting handwriting and print in electronic dossiers. Its main research contents and innovations are as follows.

First, because color itself carries little of the key information, and to simplify the data, the electronic dossier is converted to grayscale and binarized to separate the text from the background.

Second, to address the skew introduced when electronic dossiers are generated, an effective geometric correction method is analyzed on the basis of the binary images. Commonly used image tilt angle detection methods are applied to electronic dossiers, including the Hough transform method, the projection method, the cross-correlation method, and the K-nearest neighbor cluster method, and the electronic dossier is then rotated according to the detected tilt angle. The performance of these methods is compared to find the best tilt correction method for electronic dossiers.

Third, for the tilt-corrected electronic dossier, an adaptive iterative handwriting and print segmentation method is proposed, based on the features of electronic dossier images, to find the largest recognizable subgraphs of handwriting and print, whose characters are then recognized by OCR. In this process, the concept of adaptive row granularity is proposed. The electronic dossier is divided into rows according to the adaptive row granularity, the sub-regions obtained from the row segmentation are further divided into columns, and the blank regions are removed.

Fourth, for text strings that still contain handwriting after column segmentation, handwriting and print are separated, and a cluster-based connected region outline extraction method is proposed for this purpose. For text strings that do not contain handwriting, row segmentation based on the adaptive row granularity is performed. Through this repeated division, the recognizable text area is progressively narrowed until handwriting and print are finally separated.

Finally, 300 real electronic dossiers were selected, and the segmentation methods above were tested on a Windows PC using the Visual Studio 2013 development environment. The adaptive iterative handwriting and print segmentation method proposed in this thesis is suitable for electronic dossiers and achieves good segmentation results.
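The preprocessing and segmentation steps described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation, which the abstract does not give. All function names (`binarize`, `estimate_skew`, `split_runs`, `segment_regions`) are hypothetical, and the fixed `threshold`, `row_gap`, and `col_gap` parameters are simple stand-ins for the adaptive row granularity, whose definition is not stated here. Skew detection follows the general idea of the projection method only: the candidate angle whose row projection is most sharply peaked is taken as the tilt angle.

```python
import math
from collections import Counter


def binarize(rgb_image, threshold=128):
    """Gray out an RGB image and binarize it: 1 = ink (dark), 0 = background."""
    binary = []
    for row in rgb_image:
        out_row = []
        for r, g, b in row:
            gray = 0.299 * r + 0.587 * g + 0.114 * b  # common luminance weights
            out_row.append(1 if gray < threshold else 0)
        binary.append(out_row)
    return binary


def estimate_skew(binary, candidates_deg):
    """Projection method: return the candidate angle with the sharpest row profile."""
    ink = [(x, y) for y, row in enumerate(binary) for x, v in enumerate(row) if v]
    best_angle, best_score = None, -1
    for angle in candidates_deg:
        theta = math.radians(angle)
        c, s = math.cos(theta), math.sin(theta)
        # Row each ink pixel would fall on if the page were rotated back by `angle`.
        hist = Counter(round(y * c - x * s) for x, y in ink)
        score = sum(n * n for n in hist.values())  # peaked profile gives a high score
        if score > best_score:
            best_angle, best_score = angle, score
    return best_angle


def split_runs(profile, min_gap=1):
    """Return (start, end) pairs of non-blank runs in a projection profile.

    `end` is exclusive. Blank gaps shorter than `min_gap` do not split a run.
    """
    runs, start, gap = [], None, 0
    for i, count in enumerate(profile):
        if count > 0:
            if start is None:
                start = i
            gap = 0
        elif start is not None:
            gap += 1
            if gap >= min_gap:
                runs.append((start, i - gap + 1))
                start, gap = None, 0
    if start is not None:
        runs.append((start, len(profile) - gap))
    return runs


def segment_regions(binary, row_gap=1, col_gap=1):
    """Split the page into row bands, then split each band into columns.

    Blank regions are dropped. Boxes are (top, bottom, left, right) with
    exclusive bottom and right.
    """
    boxes = []
    row_profile = [sum(row) for row in binary]
    for top, bottom in split_runs(row_profile, row_gap):
        band = binary[top:bottom]
        col_profile = [sum(col) for col in zip(*band)]
        for left, right in split_runs(col_profile, col_gap):
            boxes.append((top, bottom, left, right))
    return boxes
```

In this sketch, a page would be passed through `binarize`, its tilt estimated with `estimate_skew` over a range of candidate angles such as `range(-5, 6)`, and the deskewed binary image handed to `segment_regions`. Each returned box could then be examined further, for example to decide whether it contains handwriting, which this sketch does not attempt.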
Keywords/Search Tags: Electronic dossier, Handwriting, Print, Image segmentation, Tilt correction