
The Content-analysis Based Image And Video Coding

Posted on: 2015-03-31
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Z B Shi
GTID: 1268330428999923
Subject: Signal and Information Processing
Abstract/Summary:
Image and video coding has been studied for more than twenty years with great success. However, it is increasingly difficult to keep improving coding efficiency by traditional means, and it has become necessary to analyze and understand visual content from a different angle. Recent developments in computer vision suggest using visual content analysis technologies to exploit visual correlations.

In this thesis, we mainly study how visual content analysis technologies can work together with image and video coding approaches. These technologies focus on exploiting visual correlations and help to further remove visual redundancy. The main contributions of this thesis fall into three parts.

In the first part of our work, we propose a novel visual pattern based image coding mode. This mode uses pixel-level visual pattern analysis to exploit pixel-level visual correlation. Guided by the prior knowledge established by visual patterns, we adaptively discard some high-frequency redundancy on the encoder side and restore it on the decoder side. In this way, both higher compression ratios and better visual quality can be achieved. In addition, we extend the visual pattern based approach to the scalable video coding scenario to better balance coding performance and scalability. We propose a novel visual pattern based inter-layer prediction approach, which uses searching and mapping among visual patterns to exploit spatial-temporal correlations and produces two enhanced signals that improve the precision of inter-layer prediction. We also adopt a parameter analysis based approach (e.g. using the coded base-layer HEVC quadtree information) to achieve highly efficient inter-layer prediction with only a limited increase in decoding complexity. By involving multiple content analysis technologies, our scheme supports both multi-loop and single-loop implementations.

In the second part of our work, we propose a novel feature based image compression scheme. This scheme uses robust image feature matching to achieve closer local correlations. We combine pixel based analysis and region based analysis to further reduce visual redundancy. Specifically, we first decompose an image into global information and local information via a multi-scale wavelet transform and SIFT feature extraction, and then compress each accordingly. The global information is the basic description of an image with limited redundancy. The local information consists of the SIFT features extracted in different subbands. On the decoder side, we use the decoded SIFT features to establish visual correlations with images in the cloud and generate a group of visually similar image patches. Then, we employ the aforementioned visual pattern based learning and mapping to merge these patches into the global information and reconstruct the target image. The reconstruction proceeds iteratively from the lowest subband to the highest. Taking advantage of two kinds of analysis (the pixel-level visual pattern and the region-level image feature), our method demonstrates good visual quality at very high compression ratios.

In the final part of our work, we propose a novel feature based image set compression scheme. Based on the global statistical characteristics of robust local features, we use feature distance to analyze visual similarities among images. According to these feature distances, we first divide a large image set into more compact subsets. Then, we model the relationship among the images in a subset as a directed graph. Each node of this graph denotes an image, and the weight of each edge is the feature distance between the two images it connects. By searching for the minimum spanning tree, which has the smallest total edge cost, we obtain an optimal coding structure for a given image set. To further enhance visual correlations between images, we propose a feature based three-step inter-image prediction approach. In the first step, we use a multi-model estimation algorithm to estimate multiple geometric models for different image regions, and then reduce geometric distortions in these regions accordingly. In the second step, we apply a photometric transformation to eliminate variances caused by illumination changes. In the final step, we use block-based motion compensation to improve local prediction precision. Our feature based scheme takes advantage of multiple content analysis approaches: the feature based global analysis determines a more efficient coding structure; the feature based local matching enhances inter-image region-level correlations; and the pixel based compensation produces more precise predictions. Thus, our scheme demonstrates much better performance and shows potential for future big data analysis and compression in cloud environments.
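The coding-structure step in the final part can be illustrated with a minimal sketch. It is not the thesis implementation: it assumes symmetric pairwise feature distances and uses Prim's algorithm, whereas the abstract describes a directed graph (for asymmetric costs, a minimum spanning arborescence such as Chu-Liu/Edmonds would be the natural analogue). The function name `coding_structure`, the choice of image 0 as the root, and the toy distance values are all hypothetical and chosen only for illustration.

```python
import heapq


def coding_structure(dist, root=0):
    """Return {image: reference_image} for a minimum spanning tree.

    dist is a symmetric matrix of pairwise feature distances (an assumption;
    the directed-graph formulation in the abstract is not modeled here).
    """
    n = len(dist)
    parent = {}
    visited = [False] * n
    # Heap entries are (edge cost, image, candidate reference image).
    heap = [(0.0, root, -1)]
    while heap:
        cost, node, ref = heapq.heappop(heap)
        if visited[node]:
            continue  # already attached through a cheaper edge
        visited[node] = True
        if ref >= 0:
            parent[node] = ref  # node is predicted from ref
        for other in range(n):
            if not visited[other]:
                heapq.heappush(heap, (dist[node][other], other, node))
    return parent


# Toy distances between four images (illustrative values, not thesis data).
toy_dist = [
    [0.0, 2.0, 9.0, 7.0],
    [2.0, 0.0, 3.0, 8.0],
    [9.0, 3.0, 0.0, 1.0],
    [7.0, 8.0, 1.0, 0.0],
]
structure = coding_structure(toy_dist)
total_cost = sum(toy_dist[child][ref] for child, ref in structure.items())
```

In this sketch each entry of `structure` maps an image to the image it would be predicted from, so the root is coded independently and every other image is coded relative to its tree parent.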
Keywords/Search Tags: image compression, scalable video coding, image set compression, cloud storage, visual content analysis, learning based visual pattern, local feature, feature coding, global similarity, image alignment