
Cross-Modality Image Reconstruction And Recognition Using Deep Learning

Posted on: 2021-08-16
Degree: Doctor
Type: Dissertation
Country: China
Candidate: B Cao
Full Text: PDF
GTID: 1488306050963929
Subject: Intelligent information processing
Abstract/Summary:
Images are an important source of information for humans. With the development of science and technology, a variety of image sensors have been developed, and with them a variety of image modalities: forensic sketches used in law enforcement, magnetic resonance images (MRI) used in clinical practice, near-infrared images used in security applications such as access control systems, and thermal infrared images used in life detection. Images from different sensors or different imaging environments (such as face photos taken under visible light, hand-drawn sketch portraits, near-infrared/thermal-infrared face images captured by infrared imaging equipment, cross-modality MRI images, and computed tomography/MRI images) are collectively called cross-modality images. The great differences between cross-modality images pose serious challenges to cross-modality image reconstruction and recognition, and existing methods cannot effectively address these problems or meet the requirements of many practical applications. This dissertation is therefore devoted to the tasks of cross-modality image reconstruction and recognition. Taking deep learning as the theoretical framework, it proposes a series of new methods for these tasks. The core contributions of this dissertation are summarized as follows:

1. A deep information fusion-based image cross-modality reconstruction method is proposed. Existing cross-modality reconstruction algorithms are mostly limited by the scale of available cross-modality images, which results in poor reconstruction performance that falls short of practical application scenarios. To overcome this problem, we first use different recognition models to separately supervise the image reconstruction network. Because these recognition models differ in structure and pre-training data, they bring great intra-class diversity to the
reconstructed images. These reconstructed images are then added to the training set, and the augmented dataset is used to further optimize the reconstruction model, finally yielding clearer and more realistic cross-modality reconstruction results.

2. An identity preservation-based image cross-modality reconstruction method is proposed. Existing methods consider only the pixel-level difference between the generated image and the real image during training; differences at the feature level and the semantic level are not taken into account, which results in great deformation, blurred details, and weak semantic discriminative information, as well as poor perceptual appearance and quantitative evaluation scores. To address this problem, we first use existing cross-modality face image reconstruction algorithms to augment the training set. Then, the cross-modality image reconstruction network is supervised by pixel-level, feature-level, and semantic-level information consistency; these multi-level constraints allow it to overcome the huge difference between cross-modality images. Finally, an intra-domain adaptation network further refines the details of the reconstructed images.

3. A self-representation collaborative learning-based image cross-modality reconstruction method is proposed. Existing methods can only translate images from one modality to another and cannot effectively exploit the complementary information of multiple modalities, which wastes useful information and degrades the accuracy of reconstruction results. To handle this problem, we propose a method that comprehensively utilizes all available information correlated with the target modality from multi-source-modality
images to generate any missing modality within a single model. Unlike existing methods, we introduce an autoencoder network as a novel self-supervised constraint, which provides target-modality-specific information to guide generator training, allowing the proposed method to generate more accurate results.

4. A data augmentation-based joint learning method for image cross-modality recognition is proposed. Existing methods cannot effectively learn discriminative information from small-scale cross-modality image databases, which results in poor recognition accuracy. To address this problem, we propose an asymmetric learning method that addresses cross-modality differences by incorporating synthesized images into the learning process. The aggregated data enlarges the intra-class scale, which provides more discriminative information; however, this strategy also reduces inter-class diversity (i.e., discriminative information). The proposed data augmentation-based joint learning model balances this dilemma. Finally, the similarity score between cross-modality face image pairs is obtained through the log-likelihood ratio. The method achieves strong performance on viewed-sketch, forensic-sketch, near-infrared, thermal-infrared, low-resolution photo, and occluded-image databases.

5. A multi-margin-based decorrelation learning method for image cross-modality recognition is proposed. Existing methods do not take the redundant information between cross-modality images into consideration, which results in poor recognition and verification accuracies. To handle this problem, we propose a multi-margin-based decorrelation learning approach consisting of a heterogeneous representation network and a decorrelation representation learning model. First, we employ a large collection of accessible visual face images to train the heterogeneous representation network. The decorrelation layer then projects the
output of the first component into a decorrelation latent subspace to obtain decorrelated representations. In addition, we design a multi-margin loss, consisting of a quadruplet margin loss and a heterogeneous angular margin loss, to constrain the proposed framework. The proposed model effectively improves recognition and verification accuracies between near-infrared and thermal-infrared images.
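The multi-level supervision described in contribution 2 can be sketched as a weighted sum of pixel-level, feature-level, and semantic-level consistency terms. This is a minimal illustrative sketch, not the dissertation's implementation: the L1 form of each term and the weights `w_pix`, `w_feat`, `w_sem` are assumptions for the example.

```python
def l1(u, v):
    """Mean absolute difference between two equal-length vectors."""
    return sum(abs(a - b) for a, b in zip(u, v)) / len(u)

def multilevel_loss(gen_pixels, real_pixels,
                    gen_feats, real_feats,
                    gen_logits, real_logits,
                    w_pix=1.0, w_feat=0.5, w_sem=0.1):
    """Weighted sum of pixel-, feature-, and semantic-level consistency
    terms between a generated image and its real counterpart (all three
    weights are illustrative, not values from the dissertation)."""
    return (w_pix * l1(gen_pixels, real_pixels)
            + w_feat * l1(gen_feats, real_feats)
            + w_sem * l1(gen_logits, real_logits))
```

In practice the three inputs would come from the generator output, an intermediate layer of a recognition network, and its identity logits, respectively; here they are plain vectors to keep the sketch self-contained.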
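Contribution 4 scores cross-modality face pairs with a log-likelihood ratio. A minimal one-dimensional sketch, assuming the feature difference is Gaussian with a small within-identity variance and a larger between-identity variance; both variance values are assumptions for the example, not parameters from the dissertation.

```python
import math

def gauss_logpdf(x, var):
    """Log-density of a zero-mean Gaussian with variance var at x."""
    return -0.5 * (math.log(2 * math.pi * var) + x * x / var)

def llr_score(x, y, var_within=0.5, var_between=2.0):
    """Log-likelihood ratio that the feature difference x - y comes from
    the same-identity (small-variance) rather than the different-identity
    (large-variance) distribution; higher means more likely a match."""
    d = x - y
    return gauss_logpdf(d, var_within) - gauss_logpdf(d, var_between)
```

A pair with nearly identical features scores positive, while a distant pair scores lower, so thresholding the ratio yields a verification decision.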
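The quadruplet margin loss mentioned in contribution 5 can be illustrated as follows. The sketch follows the common quadruplet-loss formulation rather than the dissertation's exact definition: the Euclidean distance and the margins `m1`, `m2` are assumptions.

```python
def dist(u, v):
    """Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v)) ** 0.5

def quadruplet_loss(anchor, positive, neg1, neg2, m1=1.0, m2=0.5):
    """Quadruplet margin loss: pushes the anchor-positive distance below
    the anchor-negative distance by margin m1, and below the distance
    between two unrelated negatives by a smaller margin m2."""
    term1 = max(0.0, dist(anchor, positive) - dist(anchor, neg1) + m1)
    term2 = max(0.0, dist(anchor, positive) - dist(neg1, neg2) + m2)
    return term1 + term2
```

The loss vanishes once both margins are satisfied, so well-separated identities contribute no gradient.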
Keywords/Search Tags: Cross-modality image, information fusion, identity preservation, self-representation collaborative learning, asymmetric learning