Research On Segmentation-free Word Spotting Of Handwritten Ancient Documents

Posted on:2019-09-11

Degree:Master

Type:Thesis

Country:China

Candidate:Z H Qiu

Full Text:PDF

GTID:2428330566486084

Subject:Signal and Information Processing

Abstract/Summary:

In research on ancient documents, the documents are digitized by scanning and stored as images. As the amount of data grows, a search system needs to be built. However, most of these documents were written by hand. The traditional way of indexing words in documents requires a segmentation preprocessing step, and because handwriting is casual and irregular, it is not easy to segment the words correctly. Segmentation-free word spotting has therefore become a research trend. At present, the difficulty of segmentation-free word spotting lies in the large variation in handwriting between writers and in the differing lengths of words. To avoid segmentation errors and raise retrieval precision, this thesis studies segmentation-free methods in two respects:

(1) A feature based on a multi-layer convolutional network is proposed to raise precision. The network architecture is based on the one proposed by the Visual Geometry Group (VGG), and it is used to extract convolutional features. During training and retrieval, the system extracts Multi-Layer Convolutional Features from the query images and from negative samples. With a trained E-SVM classifier, the system produces a score for the region covered by a sliding window. The method is tested on a twenty-page dataset containing 4860 words. The system reaches an mAP of 57.6%, which is 6.8% higher than with HOG features.

(2) A multi-scale classifier is proposed to improve precision on short words and to address the problem of scale variance. The system extracts features at different scales and trains three E-SVM classifiers with the Stochastic Gradient Descent (SGD) algorithm. Non-Maximum Suppression is used to eliminate overlapping regions and keep the high-scoring candidate regions. This method effectively improves the mAP for words of length less than 5, reaching 52%, which is 2.7% higher than without multi-scale classifiers. When multi-scale classifiers are trained on Multi-Layer Convolutional Features, an mAP of 58.7% is achieved.
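The sliding-window scoring and Non-Maximum Suppression steps described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation. The function names (`sliding_window_scores`, `non_max_suppression`), the (H, W, C) feature-map layout, the linear scoring form and the IoU threshold of 0.5 are all assumptions made for illustration, and the abstract does not state how overlap is measured.

```python
import numpy as np

def sliding_window_scores(feature_map, weights, bias=0.0, stride=1):
    """Score every window of an (H, W, C) feature map with a linear classifier.

    `weights` has shape (h, w, C); each window gets the score <window, weights> + bias.
    Returns boxes as (x1, y1, x2, y2) in feature-map cells, plus their scores.
    """
    H, W, _ = feature_map.shape
    h, w, _ = weights.shape
    boxes, scores = [], []
    for y in range(0, H - h + 1, stride):
        for x in range(0, W - w + 1, stride):
            window = feature_map[y:y + h, x:x + w, :]
            scores.append(float(np.sum(window * weights) + bias))
            boxes.append((x, y, x + w, y + h))
    return np.array(boxes, dtype=float), np.array(scores, dtype=float)

def iou(box, boxes):
    """Intersection-over-union of one box against an array of boxes."""
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area + areas - inter)

def non_max_suppression(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: keep the best box, drop boxes overlapping it too much, repeat.

    Returns the indices of the kept boxes, highest score first.
    """
    boxes = np.asarray(boxes, dtype=float)
    scores = np.asarray(scores, dtype=float)
    order = np.argsort(-scores)  # candidate indices, best score first
    keep = []
    while order.size > 0:
        best = order[0]
        keep.append(int(best))
        rest = order[1:]
        if rest.size == 0:
            break
        overlaps = iou(boxes[best], boxes[rest])
        order = rest[overlaps <= iou_threshold]
    return keep
```

In a multi-scale setting such as the one the abstract describes, the candidate boxes from each scale's classifier could be mapped back to page coordinates, concatenated, and passed through `non_max_suppression` once. Whether the thesis merges scales this way is not stated.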

Keywords/Search Tags:

Machine Learning, Image Processing, Segmentation-free
