
Research On Network Compression Of Cross Modal Image Generation Based On Canonical Polyadic Decomposition

Posted on: 2022-05-21
Degree: Master
Type: Thesis
Country: China
Candidate: J Y Lou
GTID: 2518306509484904
Subject: Software engineering
Abstract/Summary:
Cross-modal image generation produces images directly from text descriptions, which greatly extends the range of applications in computer vision. It can be used in cross-modal retrieval, artistic creation, generating images of criminal suspects, and dataset generation. With the rise of 5G technology in human-computer interaction, autonomous driving, smart cities and other fields, the demand for deployment on mobile terminals is growing. However, existing cross-modal image generation models cannot be deployed on mobile terminals because of their complex architectures and large numbers of parameters. This thesis therefore studies the compression of cross-modal image generation networks based on canonical polyadic decomposition (CP decomposition), as follows.

(1) This thesis proposes a compression reconstruction method. It applies CP decomposition rules to decompose and reconstruct the cross-modal image generation network, and it replaces the pre-trained parameters rather than decomposing them, which avoids the high computational cost of the traditional CP decomposition operation. The reconstructed network is then pre-trained layer by layer to initialize its parameters. Finally, an appropriate learning rate is selected for the reconstructed model, which is trained with backpropagation.

(2) Building on the feasibility of reconstructing the cross-modal image generation network, this thesis proposes an end-to-end lightweight network architecture, CPGAN, that does not require pre-training. This improves the versatility of the architecture and reduces the time and resources consumed by pre-training. The architecture uses CP decomposition to replace each convolutional layer of the original model with three small convolutional layers. Each small convolutional layer is then treated as an encoder and paired with a corresponding decoder, which stabilizes the training of CPGAN. To stabilize training further, a Conditioning Augmentation module is also introduced.

Extensive experiments on two classic cross-modal image generation datasets, Caltech-UCSD Birds-200-2011 and Oxford-102, show that the proposed methods reduce the number of parameters and floating-point operations by about 20% while preserving the quality of the generated images. The results indicate that the proposed reconstruction method and the lightweight CPGAN architecture, both of which incorporate CP decomposition rules, are effective and feasible.
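To make the idea of replacing one convolutional layer with three small ones concrete, the following is a minimal sketch that only counts weights. The abstract does not give the shapes of the three small layers, so the layout used here (a 1x1 convolution, a depthwise k x k convolution, and another 1x1 convolution, all sharing a rank `rank`) is an assumption, and the function names and the channel and rank values are hypothetical. The per-layer saving it prints is not the network-level figure reported in the abstract.

```python
def conv_params(c_in, c_out, k):
    """Number of weights in a standard k x k convolution (bias ignored)."""
    return c_in * c_out * k * k


def cp_conv_params(c_in, c_out, k, rank):
    """Number of weights in an assumed three-layer CP-style replacement.

    Assumed layout (not taken from the thesis):
      1. 1x1 convolution mapping c_in channels to `rank` channels
      2. depthwise k x k convolution over the `rank` channels
      3. 1x1 convolution mapping `rank` channels to c_out channels
    """
    pointwise_in = c_in * rank
    depthwise = rank * k * k
    pointwise_out = rank * c_out
    return pointwise_in + depthwise + pointwise_out


def saved_fraction(c_in, c_out, k, rank):
    """Fraction of weights removed by the factorized layer."""
    return 1.0 - cp_conv_params(c_in, c_out, k, rank) / conv_params(c_in, c_out, k)


# Hypothetical layer: 256 -> 256 channels, 3x3 kernel, rank 128.
original = conv_params(256, 256, 3)
factorized = cp_conv_params(256, 256, 3, 128)
print(original, factorized, round(saved_fraction(256, 256, 3, 128), 3))
```

The rank controls the trade-off: a smaller rank removes more weights from a layer but leaves less capacity to approximate the original kernel, so in practice it would be chosen per layer against image quality.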
Keywords/Search Tags: Cross Modal Image Generation, Model Compression, Canonical Polyadic Decomposition, Autoencoder