Learning Disentangled Representations For Adversarial Image Detection Based On Causal Inference

Posted on: 2024-07-06
Degree: Master
Type: Thesis
Country: China
Candidate: B Yan
Full Text: PDF
GTID: 2568307091988159
Subject: Computer Science and Technology
Abstract/Summary:
Deep learning has become the leading approach to many modeling tasks, known for its flexibility, generalizability, and ability to automatically extract effective high-level features. However, end-to-end deep learning typically makes decisions in a purely data-driven way, which may yield generalization behavior that human experts find difficult to explain. To resolve this problem, researchers have proposed disentangled representation learning, which explores and leverages the intrinsic working mechanisms and inherent causal relationships underlying the data. The approach decouples the generative mechanisms of multi-level and multi-scale data from the perspective of data generation, with the aim of improving the generalizability, adversarial robustness, and interpretability of deep models.

Recent work along this line imposes various constraints on the latent space to promote disentangled representation learning in deep models. However, these methods have significant limitations and leave room for performance improvement: 1) they rely on the assumption of an independent latent space, whereas in real-world applications semantic factors are often interrelated, and it is difficult, if not impossible, to disentangle them by force, which results in poor model performance; 2) encoding all information observed in the training data inevitably captures spurious correlations and results in a loss of precision; 3) disentangled representation learning for specific downstream tasks, such as explaining the source of adversarial vulnerability, has received relatively little research attention. For instance, disentangled representation learning is still at an early stage in the field of adversarial machine learning. To address these issues, the dissertation presents a causal perspective on disentangled representation learning. The main contributions are as follows.

Firstly, the dissertation proposes a disentangled representation generation model based on causal relationships. In contrast to conventional methods, the approach relaxes the assumption of an independent latent space and leverages knowledge from causal discovery to explore causal relationships between latent variables. It then uses contrastive information from pairs of images as supervisory input to the latent space, and minimizes a contrastive loss to narrow the gap between the latent variables and the ground-truth factors in the data. Experiments demonstrate that this method can adapt to image generation with causal relationships.

Secondly, the dissertation proposes a novel framework for inferring adversarial robustness based on causal relationships. The approach investigates the connection between causal relationships and adversarial robustness in the context of adversarial machine learning. The dissertation shows from a causal perspective that adversarial training works by breaking the spurious correlation between the interfering factor of adversarial perturbations and the true labels of the data. Moreover, it proposes to disentangle the latent influential factors of the data into robust and non-robust factors and, accordingly, to reconstruct a causal graph among the robust factors, non-robust factors, and data labels. The dissertation shows analytically that the spurious correlation between non-robust factors and true labels is the real factor that affects the discriminative robustness of deep models.

Thirdly, the dissertation proposes a novel algorithm for learning disentangled feature representations for adversarial image detection. This is done by introducing supervised information in the form of image pairs, each consisting of a natural image and its corresponding attack image. The algorithm leverages the property that only adversarial perturbations cause non-robust features to differ between the images in a pair. Unlike existing VAE-based methods, the dissertation proposes an indirect approach with two classifiers that guide the disentanglement of deep features into robust and non-robust components, respectively. While the classification results using robust features are almost always normal regardless of the input, those based on non-robust features can flip from normal to adversarial labels. By exploiting this difference in the dual classification results, the method is able to distinguish normal and adversarial samples effectively. Experiments demonstrate that the approach successfully detects multiple categories of adversarial samples from various datasets.

In conclusion, the dissertation introduces causal relationships into disentangled representation learning, which makes it possible to preserve causal relationships between attributes while controlling part of the disentangled factors as independently as possible. The dissertation also analyzes the source of adversarial vulnerability using causal knowledge and infers the spurious correlation between non-robust factors and data labels. Moreover, it proposes a novel disentangled feature representation learning algorithm and applies it successfully to the downstream task of adversarial image detection. The method can detect different types of adversarial attacks, and it exhibits better generalizability and transferability than previous methods that were designed to detect a particular type of attack. When tested against the latest Adv-Makeup attacks, for example, the approach achieves a success rate of 98.8%. It also performs well under another complex attack that combines digital adversarial perturbations with Deep Fakes, reaching a detection performance of 97.5% and outperforming the other methods compared.
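
The dual-classifier detection idea described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the dissertation: the name `DualClassifierDetector`, the toy feature split, the threshold classifiers, and the decision rule "flag an input when the two classifiers disagree". In the actual method, the robust and non-robust features and both classifiers would be learned by deep networks.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple

NORMAL, ADVERSARIAL = 0, 1

Features = Sequence[float]
Classifier = Callable[[Features], int]


@dataclass
class DualClassifierDetector:
    """Hypothetical detector: flags an input when the classifier on the
    non-robust features disagrees with the classifier on the robust ones."""
    split: Callable[[Features], Tuple[Features, Features]]
    robust_clf: Classifier
    non_robust_clf: Classifier

    def is_adversarial(self, features: Features) -> bool:
        robust, non_robust = self.split(features)
        # The robust branch is expected to answer NORMAL for nearly any input;
        # the non-robust branch may flip to ADVERSARIAL under a perturbation.
        return self.robust_clf(robust) != self.non_robust_clf(non_robust)


# Toy stand-ins (assumptions): the first half of the vector plays the role of
# robust features, the second half the role of non-robust features.
def split_in_half(features: Features) -> Tuple[Features, Features]:
    mid = len(features) // 2
    return features[:mid], features[mid:]


def robust_clf(robust: Features) -> int:
    mean_abs = sum(abs(x) for x in robust) / len(robust)
    return ADVERSARIAL if mean_abs > 5.0 else NORMAL


def non_robust_clf(non_robust: Features) -> int:
    return ADVERSARIAL if max(abs(x) for x in non_robust) > 0.5 else NORMAL


detector = DualClassifierDetector(split_in_half, robust_clf, non_robust_clf)

clean_features = [0.2, 0.1, 0.0, 0.01]
attacked_features = [0.2, 0.1, 0.9, -0.8]  # only the non-robust half changed

print(detector.is_adversarial(clean_features))
print(detector.is_adversarial(attacked_features))
```

The sketch only captures the decision step. How the features are disentangled and how the two classifiers are trained on natural/attack image pairs is specific to the dissertation and is not reproduced here.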
Keywords/Search Tags:Causal Inference, Feature Disentanglement, Generative Models, Adversarial Robustness, Adversarial Detection