Document Image Classification And Retrieval Based On Convolutional Neural Networks

Posted on: 2018-12-06
Degree: Master
Type: Thesis
Country: China
Candidate: L Li
Full Text: PDF
GTID: 2428330566451612
Subject: Pattern Recognition and Intelligent Systems
Abstract/Summary:
With the development of computer technology and the prevalence of image acquisition terminals, more and more documents are collected and processed in the form of images. Designing document image processing systems that can handle the growing number of document images has become an active research topic. Document image classification and retrieval are two major tasks in such systems. This thesis focuses on both tasks, and its main contributions are as follows:

1) A document image classification method based on a convolutional neural network and a support vector machine is proposed. The method extracts high-level visual features from raw image pixels with the convolutional neural network and classifies those features with the support vector machine, achieving satisfactory classification performance. Three classical convolutional network architectures are tested on two publicly available datasets of different sizes. Experiments show that the method achieves better performance and transfers easily between different types of datasets.

2) A document image retrieval method based on a convolutional neural network and a hierarchical k-means tree is proposed. After image features are extracted with the convolutional neural network, normalization and principal component analysis are applied for dimensionality reduction. In the retrieval phase, approximate nearest neighbor search based on a hierarchical k-means tree is used to improve retrieval efficiency on large datasets. Experiments show that the method achieves acceptable precision with reduced retrieval time, which makes it of high practical value.

Overall, the experimental results show that the methods based on convolutional neural networks are superior in performance, and that the classification accuracy and retrieval precision meet existing requirements.
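The retrieval stage described above can be illustrated with a minimal sketch, assuming the feature vectors have already been extracted by a convolutional neural network and reduced in dimension. The abstract does not give the thesis's actual implementation, so every name and parameter here (`l2_normalize`, `build_tree`, `search`, `branching`, `leaf_size`) is hypothetical, and the PCA step is omitted to keep the example within the standard library. The search descends greedily to a single leaf, which is what makes it approximate rather than exact.

```python
import math
import random

def l2_normalize(vec):
    """Scale a feature vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def assign(items, centers):
    """Put each (doc_id, vector) item into the cluster of its nearest center."""
    clusters = [[] for _ in centers]
    for item in items:
        j = min(range(len(centers)), key=lambda c: sq_dist(item[1], centers[c]))
        clusters[j].append(item)
    return clusters

def kmeans(items, k, rng, iters=10):
    """Plain Lloyd's k-means; returns the non-empty (centers, clusters)."""
    centers = [list(vec) for _, vec in rng.sample(items, k)]
    for _ in range(iters):
        clusters = assign(items, centers)
        for j, members in enumerate(clusters):
            if members:
                vectors = [vec for _, vec in members]
                centers[j] = [sum(col) / len(vectors) for col in zip(*vectors)]
    # Final assignment so every item sits under its nearest final center.
    clusters = assign(items, centers)
    kept = [(c, m) for c, m in zip(centers, clusters) if m]
    return [c for c, _ in kept], [m for _, m in kept]

def build_tree(items, branching=4, leaf_size=8, rng=None):
    """Recursively split items with k-means until clusters are small."""
    rng = rng or random.Random(0)
    if len(items) <= max(leaf_size, branching):
        return {"leaf": items}
    centers, clusters = kmeans(items, branching, rng)
    if len(clusters) < 2:  # degenerate split (e.g. duplicates): stop here
        return {"leaf": items}
    children = [build_tree(c, branching, leaf_size, rng) for c in clusters]
    return {"centers": centers, "children": children}

def search(tree, query, top_k=1):
    """Greedy descent to one leaf, then a linear scan of that leaf only."""
    node = tree
    while "leaf" not in node:
        centers = node["centers"]
        j = min(range(len(centers)), key=lambda c: sq_dist(query, centers[c]))
        node = node["children"][j]
    ranked = sorted(node["leaf"], key=lambda item: sq_dist(query, item[1]))
    return [doc_id for doc_id, _ in ranked[:top_k]]

# Usage with random stand-in features (not real CNN features).
rng = random.Random(42)
database = [(i, l2_normalize([rng.gauss(0, 1) for _ in range(16)]))
            for i in range(200)]
index = build_tree(database)
nearest_ids = search(index, database[17][1], top_k=3)
```

Because only one leaf is scanned per query, the cost grows roughly with the tree depth plus the leaf size instead of the full database size, at the price of occasionally missing a true neighbor that fell into a sibling branch. Practical implementations usually recover some of that precision by also checking a few additional branches.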
Keywords/Search Tags:document image classification, document image retrieval, neural networks, feature compression, approximate nearest neighbor search