Research On Network Compression Of Cross Modal Image Generation Based On Canonical Polyadic Decomposition

Posted on:2022-05-21

Degree:Master

Type:Thesis

Country:China

Candidate:J Y Lou

GTID:2518306509484904

Subject:Software engineering

Abstract/Summary:

Cross-modal image generation produces images directly from text descriptions, which greatly extends the range of applications of computer vision. It can be used in cross-modal retrieval, artistic creation, criminal image generation and dataset generation. With the rise of 5G technology in human-computer interaction, autonomous driving, smart cities and other fields, the demand for deployment on mobile terminals is growing. However, existing cross-modal image generation models cannot be deployed on mobile terminals because of their complex architectures and large numbers of parameters. Therefore, this thesis carries out the following compression research on cross-modal image generation networks based on canonical polyadic decomposition (CP decomposition):

(1) This thesis proposes a compression reconstruction method that applies CP decomposition rules to decompose and reconstruct the cross-modal image generation network, rather than decomposing the pre-trained parameters and replacing them with the result. This avoids the high computational cost of the traditional CP decomposition operation. The reconstructed network is then pre-trained layer by layer to initialize its parameters. Finally, an appropriate learning rate is selected for the reconstructed model, which is trained with back propagation.

(2) Building on the feasibility of reconstructing the cross-modal image generation network, this thesis proposes an end-to-end lightweight network architecture, CPGAN, that does not require pre-training. This improves the versatility of the architecture and reduces the time and resources consumed by pre-training. The architecture uses CP decomposition to replace each convolutional layer of the original model with three small convolutional layers. Each small convolutional layer is then treated as an encoder and paired with a corresponding decoder, which stabilizes the training of CPGAN. To further stabilize training, a Conditioning Augmentation module is also introduced.

Extensive experiments on two classic cross-modal image generation datasets, Caltech-UCSD Birds-200-2011 and Oxford-102, show that the proposed methods compress about 20% of the parameters and floating-point operations while preserving the quality of the generated images. The results indicate that the proposed reconstruction method and the lightweight architecture CPGAN, both of which incorporate CP decomposition rules, are effective and feasible.
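The replacement of one convolutional layer by three small ones can be illustrated with a simple weight count. The sketch below is not the thesis's implementation. It assumes a common three-layer CP-style layout (a 1x1 convolution from the input channels to `rank` channels, a depthwise k x k convolution on those `rank` channels, and a 1x1 convolution from `rank` to the output channels), ignores biases, and reads "about 20%" as removing roughly 20% of the weights. The function names, the layer sizes and the 0.8 budget are all hypothetical choices made for illustration.

```python
def conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias ignored)."""
    return c_in * c_out * k * k


def cp_conv_params(c_in, c_out, k, rank):
    """Weights in the ASSUMED three-layer replacement:
    1x1 conv (c_in -> rank), depthwise k x k conv (rank channels),
    1x1 conv (rank -> c_out). Bias ignored."""
    return c_in * rank + rank * k * k + rank * c_out


def max_rank_for_budget(c_in, c_out, k, keep_ratio):
    """Largest rank whose three-layer replacement keeps at most
    keep_ratio of the original layer's weights."""
    budget = keep_ratio * conv_params(c_in, c_out, k)
    per_rank = c_in + k * k + c_out  # weights added per unit of rank
    return int(budget // per_rank)


if __name__ == "__main__":
    # Hypothetical layer sizes, not taken from the thesis.
    c_in, c_out, k = 256, 128, 3
    rank = max_rank_for_budget(c_in, c_out, k, keep_ratio=0.8)
    original = conv_params(c_in, c_out, k)
    compressed = cp_conv_params(c_in, c_out, k, rank)
    print(f"rank={rank}, original={original}, compressed={compressed}, "
          f"ratio={compressed / original:.3f}")
```

Under the further assumption that all three layers use stride 1 and keep the spatial resolution, the multiply-accumulate count of each layer is its weight count times the number of output positions, so the same ratio applies to floating-point operations as to parameters.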

Keywords/Search Tags:

Cross Modal Image Generation, Model Compression, Canonical Polyadic Decomposition, Autoencoder

Related items

1	Multi-target Localization With MIMO Radar Via Tensor Decomposition
2	Parameter Estimation Methods And Applications Based On Tensor Decomposition
3	Research On Image Generation Algorithm Based On Autoencoder
4	Multi-target Localization With MIMO Radar Via Coupled Canonical Polyadic Decomposition
5	Cross-modal Multimedia Information Retrieval
6	Research On Image-Text Cross-Modal Matching Based On Attention Mechanism
7	Research And Optimization Of Neural Network Acceleration Algorithm
8	Research On Near-field Source Location Based On Tensor Decomposition
9	Research On Cross-modal Hashing Algorithm Based On Kernel Canonical Correlation Analysis And Neural Network
10	Cross-modal Music Retrieval Based On Canonical Correlation