Fast Keyword Spotting In Handwritten Chinese Documents

Posted on: 2016-10-05
Degree: Master
Type: Thesis
Country: China
Candidate: G Yu
Full Text: PDF
GTID: 2348330479953249
Subject: Control theory and control engineering
Abstract/Summary:
With the rapid development of computers and networks, digital resources are becoming increasingly abundant. To make editing, storage and transmission easier, more and more paper documents are being converted into digital documents. Because a large share of these resources is stored as images rather than as encoded text, managing and using the document information efficiently, and in particular retrieving document content quickly, is an important research direction.

Although numerous works on document retrieval have been published in recent years, achieving both a high retrieval rate and high speed remains a challenge. This thesis proposes a fast, real-time keyword spotting method for large databases of multi-writer offline handwritten Chinese documents.

First, a keyword spotting system for handwritten Chinese documents is built on over-segmentation and character recognition. Index files are generated from the multiple candidate recognition results computed on the candidate segmentation-recognition lattice of each document image. Keywords are then retrieved from the index files rather than from the images, which greatly accelerates retrieval while preserving accuracy.

Second, the initially generated index contains a lot of redundant information and is therefore very large. Taking the context information between characters into account, the top-N optimal context paths are searched from the candidate segmentation-recognition lattice of each text line, and a new index is generated from these paths. This compresses the index files and further accelerates retrieval while preserving accuracy.

Finally, a series of experiments on index compression and keyword retrieval is carried out on the handwritten Chinese document database CASIA-HWDB. The experimental results demonstrate the effectiveness of the proposed method.
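The indexing idea described in the abstract can be illustrated with a small Python sketch. It is not the thesis's implementation: the lattice format (edges given as `(start, end, char, log_score)` between segmentation points), the `bigram` table used as a stand-in for context information, the additive path score, and the function names `top_n_paths`, `build_index` and `search` are all assumptions made for illustration. The sketch keeps the N best paths per text line with a beam search, indexes only the characters on those paths, and answers a keyword query from the index alone.

```python
from collections import defaultdict
import heapq


def top_n_paths(edges, last_node, bigram, beam=3):
    """Beam search over one text line's lattice (hypothetical format).

    edges: (start, end, char, log_score) tuples with start < end.
    Returns up to `beam` complete paths as (score, text, spans).
    """
    outgoing = defaultdict(list)
    for start, end, char, score in edges:
        outgoing[start].append((end, char, score))

    beams = defaultdict(list)          # node -> partial paths ending there
    beams[0] = [(0.0, "", ())]
    for node in range(last_node):      # nodes are already in topological order
        # Prune: keep only the `beam` best partial paths at this node.
        beams[node] = heapq.nlargest(beam, beams[node], key=lambda p: p[0])
        for score, text, spans in beams[node]:
            for end, char, rec in outgoing[node]:
                # Context bonus from the assumed bigram table.
                ctx = bigram.get(text[-1] + char, 0.0) if text else 0.0
                beams[end].append(
                    (score + rec + ctx, text + char, spans + ((node, end),))
                )
    return heapq.nlargest(beam, beams[last_node], key=lambda p: p[0])


def build_index(lines, bigram, beam=3):
    """Index only the characters on the top-N paths of each line.

    lines: {line_id: (edges, last_node)}.
    Returns char -> set of (line_id, path_rank, position).
    """
    index = defaultdict(set)
    for line_id, (edges, last_node) in lines.items():
        paths = top_n_paths(edges, last_node, bigram, beam)
        for rank, (_, text, _) in enumerate(paths):
            for pos, ch in enumerate(text):
                index[ch].add((line_id, rank, pos))
    return index


def search(index, keyword):
    """Return sorted ids of lines where `keyword` occurs on an indexed path."""
    if not keyword:
        return []
    hits = index.get(keyword[0], set())
    for offset, ch in enumerate(keyword[1:], start=1):
        postings = index.get(ch, set())
        # A hit survives only if the next character follows on the same path.
        hits = {(l, r, p) for (l, r, p) in hits if (l, r, p + offset) in postings}
    return sorted({l for l, _, _ in hits})


if __name__ == "__main__":
    # Toy lattice: two candidate characters per segment, plus one merged segment.
    edges = [
        (0, 1, "中", -0.1), (0, 1, "申", -0.9),
        (1, 2, "国", -0.2), (1, 2, "园", -0.5),
        (0, 2, "回", -2.0),
    ]
    index = build_index({0: (edges, 2)}, bigram={"中国": 0.5}, beam=2)
    print(search(index, "中国"))
    print(search(index, "申国"))
```

In this toy setup a smaller `beam` gives a smaller index at the cost of dropping low-scoring candidates, which mirrors the trade-off between index size and accuracy that the abstract describes.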
Keywords/Search Tags: Handwritten Chinese Document Image, Keyword Spotting, Index Compression, Fast Retrieval, Beam Search