
Image Compression And Semantic Quality Assessment For Optical Character Recognition

Posted on: 2017-01-10
Degree: Master
Type: Thesis
Country: China
Candidate: D D Wang
Full Text: PDF
GTID: 2308330485953812
Subject: Information and Communication Engineering
Abstract/Summary:
With the development of mobile multimedia applications, what people need from the Internet is no longer limited to information communication, image sharing, and the like. The cloud is characterized by a large amount of computing resources, storage, and data. It would be more convenient to take advantage of the cloud for all kinds of image processing on multimedia big data, such as image retrieval and object recognition. To reduce the coding rate for transmitting images to the cloud, we need to retarget image compression and image quality assessment for mobile media applications. Unlike traditional image coding and image quality assessment, we propose the following two methods to save coding rate in this paper.

From the perspective of the coding itself, image compression has been extensively studied as a way to reduce the coding rate while producing acceptable visual quality. However, there are many application scenarios where the compressed images are used for automatic recognition rather than human viewing, so visual quality is no longer critical for compression. SIFT features have demonstrated their utility in many recognition scenarios, and SIFT-preserving compression has been developed recently. In this paper, we first study the SIFT-preserving compression of license plate images for recognition accuracy rather than visual quality. According to the extracted SIFT features, each image is divided into SIFT coding-units and non-SIFT coding-units. Each coding-unit is assigned a different quality parameter when JPEG is used for compression. We compare our proposed scheme with standard JPEG, which uses a single quality parameter for the whole image. Experimental results with manually tuned parameters show that our scheme saves 14% of the bit rate on average, without any loss of recognition accuracy.

Furthermore, we propose a reliable image semantic quality assessment (ISQA) method to optimize the coding rate.
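As a rough illustration of the coding-unit split described for the first method, the following minimal sketch builds a per-block quality map from keypoint locations. It is not the thesis's implementation: the function name `assign_block_quality`, the 8x8 block size, the two quality values, and the rule "a block is a SIFT coding-unit if a keypoint center falls inside it" are all assumptions made for the example, and the keypoints are taken as already-extracted (x, y) coordinates.

```python
from typing import Iterable, List, Tuple


def assign_block_quality(
    width: int,
    height: int,
    keypoints: Iterable[Tuple[float, float]],
    block: int = 8,      # assumed coding-unit size (JPEG works on 8x8 blocks)
    q_sift: int = 80,    # assumed quality for units containing a keypoint
    q_other: int = 30,   # assumed quality for all other units
) -> List[List[int]]:
    """Return a rows x cols map of quality parameters, one per coding unit."""
    cols = (width + block - 1) // block   # ceiling division
    rows = (height + block - 1) // block
    qmap = [[q_other] * cols for _ in range(rows)]
    for x, y in keypoints:
        # Ignore keypoints that fall outside the image.
        if 0 <= x < width and 0 <= y < height:
            qmap[int(y) // block][int(x) // block] = q_sift
    return qmap


if __name__ == "__main__":
    # Hypothetical 32x16 image with two keypoint centers.
    for row in assign_block_quality(32, 16, [(3.5, 2.0), (20.0, 12.0)]):
        print(row)
```

A real encoder would also have to decide how far a keypoint's support region extends around its center and how the per-unit quality is signalled to the decoder; the abstract does not specify either, so both are left out here.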
Considering that the information sink is a computer vision algorithm rather than a person, we argue that the quality of compressed images should be evaluated by the semantic-related features they preserve, instead of by pixel-wise fidelity (e.g. PSNR) or visual quality (e.g. SSIM). In this paper, we make an empirical study of an ISQA approach based on SIFT features extracted from both the original and the compressed images, and we formulate an optimization problem to find the operating point of an image compression system for text image recognition. We then calculate the average bit rate of the compressed images under different operating points. Experimental results show that our proposed ISQA measure is significantly better than PSNR and SSIM at predicting the recognizability of compressed text images. Accordingly, using our ISQA measure during compression leads to bit rate savings of more than 58% (37%) compared with using PSNR (SSIM). In addition, we explore the correlation between subjective evaluation and the objective evaluation obtained by our ISQA measure for recognition. Experimental results show that the PLCC between our ISQA measure and subjective evaluation reaches 0.8401, and the RMSE is as low as 0.5325, which demonstrates high consistency with human perceptual quality for text image recognition.

From the perspective of mobile-cloud computing application scenarios, we propose a new image compression approach and an image semantic quality assessment algorithm, and we have verified the effectiveness of the proposed algorithms through extensive experiments. The deficiencies of the algorithms will be addressed in our future work.
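For readers unfamiliar with the two agreement figures quoted above, the sketch below shows how a Pearson linear correlation coefficient (PLCC) and a root-mean-square error (RMSE) between objective scores and subjective scores are commonly computed. The function names and sample scores are illustrative only. Quality-assessment studies often fit a nonlinear mapping to the objective scores before computing these figures; the abstract does not say whether that was done here, so this sketch computes them directly.

```python
import math
from typing import Sequence


def plcc(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson linear correlation coefficient between two score lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equally long lists with at least 2 items")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / math.sqrt(var_x * var_y)


def rmse(x: Sequence[float], y: Sequence[float]) -> float:
    """Root-mean-square error between two equally long score lists."""
    if len(x) != len(y) or not x:
        raise ValueError("need two equally long, non-empty lists")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))


if __name__ == "__main__":
    # Made-up scores, for illustration only.
    objective = [1.0, 2.0, 3.0, 4.0]
    subjective = [1.2, 1.9, 3.4, 3.8]
    print(plcc(objective, subjective), rmse(objective, subjective))
```

A PLCC near 1 means the objective measure tracks the subjective scores almost linearly, and the RMSE reports the typical gap between the two on the subjective rating scale.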
Keywords/Search Tags:Image compression, Image semantic quality assessment, JPEG, OCR, SIFT, Dense SIFT, PSNR, SSIM