
Research On Semantics-to-Signal Scalable Image Compression Methods

Posted on: 2023-04-12
Degree: Doctor
Type: Dissertation
Country: China
Candidate: K Liu
Full Text: PDF
GTID: 1528306902464264
Subject: Data Science (Information and Communication Engineering)
Abstract/Summary:
Images and videos carry rich semantic information and are among the main sources of information for humans. The rapid growth of image and video data places higher demands on signal compression and makes it challenging to understand massive numbers of images with limited manpower. Progress in computer vision has promoted the industrialization of machine analysis, but machine vision cannot completely replace human observation and decision-making, so scenarios of human-machine co-judgment will persist for a long time. In practical applications, images and videos are mostly stored, transmitted, and analyzed in compressed form. Image/video coding therefore needs to serve both human vision and machine vision.

Traditional image coding methods focus on minimizing signal distortion for human vision under a limited bit rate, but the signal distortion introduced by coding tends to reduce the accuracy of machine analysis. Machine-vision-oriented image coding aims to minimize semantic misalignment under a limited bit rate and preserves the accuracy of machine analysis on the bitstream by compressing semantic features; however, it is often difficult to reconstruct images from those features. We combine the advantages of these two coding approaches in image fidelity and semantic fidelity and propose a new semantics-to-signal scalable image coding framework. In the proposed framework, images are first represented as semantic features by a revertible transform, so that features and images can be mapped bidirectionally. Then, a feature decoupling constraint is introduced to enhance feature discrimination. Finally, a multi-feature joint layered coding technology is designed to jointly optimize bit rate, signal distortion, and semantic misalignment during compression. Some new problems in image coding for human-machine vision are explored. The main work and innovations of this dissertation are as follows:

· This dissertation proposes a semantics-to-signal scalable revertible transform technology. Because existing methods cannot ensure energy invariance and semantic concentration simultaneously, we combine a nonlinear network with the traditional lifting structure, achieving both feature extraction and image-feature bidirectional mapping. To implement semantics-to-signal scalability, the revertible transform is optimized with a task-driven loss. As a result, the image information is distributed across multi-resolution features, where shallow features represent the structural characteristics of images and deep features represent their semantic characteristics. The results show that the proposed transform achieves a structured representation of image information for semantic analysis tasks.

· This dissertation proposes a semantics-to-signal scalable feature decoupling technology. To address the lack of an explicit constraint on feature distinction in the proposed revertible transform, a feature decoupling technique is studied that uses semantic tags as a weakly supervised constraint. For shallow features, a signal decoupling constraint based on a generative adversarial network is proposed. For deep features, an interaction constraint on homogeneous features based on bidirectional mapping and a compact representation constraint on heterogeneous features based on a variational autoencoder are proposed. In addition, simulation data are constructed to evaluate decoupling efficiency under supervised conditions. The results show that this method reduces the information coupling between features.

· This dissertation proposes a semantics-to-signal scalable multi-feature joint layered coding technology. Because compression needs to take both human and machine vision into account, a multi-feature joint layered coding framework is proposed based on the structured representation and feature decoupling. First, we design a resolution-adaptive feature compression unit to encode single-layer features with different dimensions. Second, a feature prediction unit is designed to reduce inter-layer redundancy. Third, we design a post-processing unit to improve image reconstruction quality. A rate-distortion-misalignment joint optimization strategy is adopted to achieve end-to-end optimization. The results show that the proposed scheme achieves semantics-to-signal scalable image coding, with high accuracy in machine analysis and reconstructed images whose subjective quality is better than that of BPG.

With the above design, we achieve lossy image coding for human-machine vision. The bitstream is partially decodable: progressive decoding supports coarse-to-fine-grained machine vision tasks and gradually recovers the image. The results on multiple datasets demonstrate the effectiveness of the proposed method. The semantics-to-signal scalable image coding proposed in this dissertation combines image representation with joint coding and explores a new direction for image compression in human-machine co-judgment scenarios.
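
As a rough illustration of why the lifting structure mentioned in the first contribution permits bidirectional mapping even when nonlinear networks are inserted, the sketch below implements a single generic lifting step on a 1-D signal. It is not the dissertation's transform: the `predict` and `update` functions, the random weights `W_p` and `W_u`, and the signal length are all hypothetical stand-ins for learned networks, chosen only to show that the inverse never has to invert the nonlinear functions themselves.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins for learned networks (assumption, not from the source).
W_p = rng.normal(size=(4, 4))
W_u = rng.normal(size=(4, 4))

def predict(even):
    # Arbitrary nonlinear predictor; it does not need to be invertible.
    return np.tanh(even @ W_p)

def update(detail):
    # Arbitrary nonlinear update; it does not need to be invertible either.
    return 0.5 * np.tanh(detail @ W_u)

def lifting_forward(signal):
    # Split into even and odd samples, then apply predict and update steps.
    even, odd = signal[0::2], signal[1::2]
    detail = odd - predict(even)      # residual after prediction
    coarse = even + update(detail)    # coarse part adjusted by the residual
    return coarse, detail

def lifting_inverse(coarse, detail):
    # Undo the steps in reverse order, reusing the same predict/update functions.
    even = coarse - update(detail)
    odd = detail + predict(even)
    signal = np.empty(even.size + odd.size)
    signal[0::2] = even
    signal[1::2] = odd
    return signal

x = rng.normal(size=8)
coarse, detail = lifting_forward(x)
x_rec = lifting_inverse(coarse, detail)
print(np.allclose(x_rec, x))
```

The reconstruction matches the input up to floating-point rounding because each step only adds or subtracts a quantity that the decoder can recompute from what it already has.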
Keywords/Search Tags: Image Coding, Scalable Coding, Revertible Transform, Feature De-couple, Deep Learning