
Research On Model Extraction Attack Technology Based On Explainable Artificial Intelligence

Posted on: 2024-08-14 | Degree: Doctor | Type: Dissertation
Country: China | Candidate: A L Yan | Full Text: PDF
GTID: 1528307115981799 | Subject: Cyberspace security
Abstract/Summary:
The development and application of artificial intelligence have brought profound changes to people's lives. Deep neural networks in particular have played a crucial role in driving transformation and upgrading in fields such as agricultural production and public security. However, the opacity of deep neural networks makes them difficult to apply in sensitive data domains. To build trust in deep neural networks and integrate them more meaningfully into daily life, explainable technical systems must be established. To this end, explainable artificial intelligence (XAI) has been intensively researched and developed. Although XAI alleviates the opacity of deep neural networks, the additional explanation information it exposes introduces new security and privacy threats. Research on the security and privacy risks of XAI is still at an early stage and has focused mainly on model integrity and data privacy; the confidentiality of XAI models has not yet received attention. The objective of this thesis is to explore model extraction attack techniques against XAI. The main contributions are as follows:

(1) To address the fact that traditional model extraction attacks are not suited to stealing XAI models, a universal XAI-based model extraction attack method is proposed. It can attack both conventional deep neural network models and explainable deep neural network models. Experimental results verify that the Grad-CAM model explanation poses the greatest risk to model privacy.

(2) To address the incomplete task functionality stolen by XAI-based model extraction attacks, a dual-task XAI-based model extraction attack method is proposed. It simultaneously steals both the prediction behavior and the explanation function of an explainable deep neural network model. Experimental results show that the dual-task attack achieves higher fidelity.

(3) To relax the requirement that XAI-based model extraction attacks demand extensive adversary knowledge of the target model, a data-free XAI-based model extraction attack method is proposed. It steals the predictive behavior of explainable deep neural network models without collecting a dedicated attack dataset. Experimental results show that the proposed method achieves higher fidelity than state-of-the-art data-free model extraction attacks.

(4) To address the lack of clarity about the factors that influence model extraction attacks, an evaluation framework for such potential influencing factors is proposed. The framework examines factors such as target model task accuracy, target model architecture, and target model robustness. Evaluation experiments with the framework yield a series of meaningful findings: models whose robustness is improved through adversarial training are more vulnerable to model extraction attacks, and simple target model architectures are more vulnerable than complex ones.

In summary, this thesis starts from the privacy leakage risk of XAI models and proposes model extraction attack methods for XAI from several perspectives. The research results have important theoretical significance and application value for promoting the development of XAI.
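The attacks in contributions (1)-(3) all build on the same basic extraction loop: query the black-box target, collect its outputs, and train a surrogate to imitate them, then measure fidelity as the agreement between surrogate and target on fresh queries. The sketch below illustrates only that generic loop, not the thesis's methods; the linear "target", the attacker's synthetic query distribution, and all names are hypothetical toy choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical black-box target: a fixed linear classifier the attacker
# cannot inspect, only query for hard labels (as a prediction API would return).
W_target = rng.normal(size=(2, 3))

def query_target(x):
    return np.argmax(x @ W_target.T, axis=1)

# Step 1: the attacker queries the target on synthetic inputs.
X = rng.normal(size=(2000, 3))
y = query_target(X)

# Step 2: train a surrogate on the stolen labels
# (multinomial logistic regression fitted by plain gradient descent).
W_sur = np.zeros((2, 3))
for _ in range(300):
    logits = X @ W_sur.T
    logits -= logits.max(axis=1, keepdims=True)   # numerical stability
    p = np.exp(logits)
    p /= p.sum(axis=1, keepdims=True)             # softmax probabilities
    grad = (p - np.eye(2)[y]).T @ X / len(X)      # cross-entropy gradient
    W_sur -= 0.5 * grad

# Step 3: fidelity = agreement between surrogate and target on fresh queries.
X_test = rng.normal(size=(500, 3))
fidelity = np.mean(np.argmax(X_test @ W_sur.T, axis=1) == query_target(X_test))
print(f"fidelity: {fidelity:.2f}")
```

The dual-task variant in contribution (2) would extend step 2 so the surrogate's training loss also penalizes mismatch with the target's explanations, and the data-free variant in contribution (3) would replace the fixed synthetic queries in step 1 with generated ones.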
Keywords/Search Tags: explainable artificial intelligence, model extraction attacks, model extraction attack evaluation