
Research Of Factors And The Strategies To Improve Accuracy Of OCR Recognition On The Text-based Digital Images Of Information Resource Digitization

Posted on: 2012-09-08
Degree: Master
Type: Thesis
Country: China
Candidate: J Guo
GTID: 2218330338956704
Subject: Information Science
Abstract/Summary:
Optical Character Recognition (OCR) is one part of the work of information resource digitization. Its life cycle comprises four stages: selecting the objects to be scanned, producing digital images, processing the digital images, and optically recognizing the digital images. The accuracy of OCR has been a focus of attention wherever OCR is used in digitization work, because it is an important factor in ensuring the quality of digital products and in giving users assurance about the digitization work as a whole. This article points out that OCR is an organic part of the information resource digitization process and that the OCR work is itself a complete system. First, following the four stages of the life cycle, the article introduces the factors that may affect OCR accuracy at each stage. It then proposes corresponding strategies to improve the recognition accuracy of text-based digital images. In view of the features of OCR work within information resource digitization, the article concentrates on the factors affecting accuracy, and the corresponding improvement strategies, in three stages: producing images, processing images, and optically recognizing digital images. The article is divided into four sections. The first chapter is the introduction; it presents the origin and significance of the topic, reviews the existing relevant research, and describes the main research methods and the innovations. The second chapter comprehensively analyses the factors that influence OCR of text-based digital images, following the four stages of the life cycle.
The third chapter, building on the second, introduces targeted strategies for improving the accuracy of OCR of text-based digital images. Finally, the article summarizes its content, points out its inadequacies, and indicates directions for future work.
Keywords/Search Tags: accuracy of OCR recognition, text-based digital image, information resource digitization