
Bias Mitigation For Deep Models At The Inference Stage

Posted on: 2024-09-01    Degree: Master    Type: Thesis
Country: China    Candidate: X W Dong    Full Text: PDF
GTID: 2568307139970769    Subject: Cyberspace security
Abstract/Summary:
With the development of deep learning, fairness has become a central concern in artificial intelligence (AI) systems, especially in societal applications where discrimination is legally forbidden: loan management systems should evaluate applicants from different demographic groups equally, hiring systems should recommend candidates equally across demographic groups, and risk assessment systems must eliminate racial bias in criminal justice. Existing efforts toward the ethical development of AI systems leverage data science to mitigate biases in the training set or introduce fairness principles into the training process. For a deployed AI system, however, retraining or fine-tuning with such methods may not be feasible in practice. Therefore, this thesis proposes bias mitigation approaches that improve the fairness of deployed deep models at the inference stage without re-training.

For the scenario in which label information for the training dataset is available, a flexible approach named Fairness-Aware Adversarial Perturbation (FAAP) is proposed. FAAP learns to perturb input data so as to blind the deployed model to fairness-related features while preserving the information related to the target label; in this way, the fairness of a deployed model can be improved without re-training. To achieve this, a discriminator is designed to distinguish fairness-related attributes based on the latent representations of the deployed model. Meanwhile, a perturbation generator is trained against the discriminator so that no fairness-related features can be extracted from perturbed inputs, and the deployed model therefore treats samples from different demographic groups equally.

For the scenario in which label information for the training dataset is unavailable, a framework named FDO (Fair Drop-Out) is proposed, which mitigates unfairness by identifying biased neurons and dropping their intermediate outputs. Once these biased neurons are found and their outputs are dropped, the deployed model retains fairer features, thus improving fairness. To this end, a paired-sample generator produces image pairs with different protected attributes for each given source image. A bias detector then traces the differences in the features of the image pairs and locates the biased neurons that are sensitive to different protected attributes. Finally, the outputs of the identified biased neurons are dropped to produce fairer features, thereby mitigating the unfairness of the deployed model.

Moreover, the proposed FAAP is further adapted to the scenario without label information for the training dataset (referred to as FAAP_FDO). Extensive experiments on real-world image datasets demonstrate the effectiveness and superior performance of the proposed methods in improving the fairness of deployed models at the inference stage.
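To make the FAAP mechanism concrete, below is a minimal PyTorch-style sketch of the adversarial scheme described above. It is not the thesis's actual implementation: the frozen deployed model is assumed to expose hypothetical `features` and `classifier` modules, and the generator architecture, perturbation bound, and loss weighting are illustrative assumptions.

```python
# Sketch (assumed interfaces): a perturbation generator is trained against a
# fairness discriminator so that the frozen deployed model's latent features
# no longer reveal the protected attribute, while the target label is preserved.
import torch
import torch.nn as nn

class PerturbationGenerator(nn.Module):
    """Produces a small, bounded additive perturbation for each input image."""
    def __init__(self, eps=0.05):
        super().__init__()
        self.eps = eps
        self.net = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 3, 3, padding=1), nn.Tanh(),
        )

    def forward(self, x):
        return x + self.eps * self.net(x)  # perturbation bounded by eps

def faap_step(generator, discriminator, deployed_model, x, y, s, g_opt, d_opt, lam=1.0):
    """One adversarial update. x: images, y: target labels, s: protected attributes.
    The deployed model is frozen; only the generator and discriminator are updated."""
    task_loss, adv_loss = nn.CrossEntropyLoss(), nn.CrossEntropyLoss()

    # Discriminator step: try to recover s from the deployed model's latent features.
    with torch.no_grad():
        z = deployed_model.features(generator(x))   # hypothetical feature extractor
    d_opt.zero_grad()
    d_loss = adv_loss(discriminator(z), s)
    d_loss.backward()
    d_opt.step()

    # Generator step: keep task utility on y while fooling the discriminator on s.
    g_opt.zero_grad()
    z = deployed_model.features(generator(x))
    g_loss = task_loss(deployed_model.classifier(z), y) - lam * adv_loss(discriminator(z), s)
    g_loss.backward()
    g_opt.step()                                     # deployed model is never updated
    return g_loss.item(), d_loss.item()
```

Similarly, the FDO procedure can be sketched as below, under the same assumed `features`/`classifier` interface; the pairing of samples by protected attribute, the flattened feature layout, and the drop ratio are assumptions for illustration.

```python
# Sketch (assumed interfaces): locate neurons whose activations change most
# between paired samples that differ only in the protected attribute, then
# drop those neurons' outputs at inference time.
import torch

def locate_biased_neurons(deployed_model, paired_loader, ratio=0.05):
    """Average the per-neuron feature gap over image pairs and return the indices
    of the most attribute-sensitive ("biased") neurons."""
    gap_sum, n = None, 0
    with torch.no_grad():
        for x_a, x_b in paired_loader:   # a pair differs only in the protected attribute
            diff = (deployed_model.features(x_a) - deployed_model.features(x_b)).abs()
            diff = diff.flatten(1).mean(0)           # mean gap per neuron
            gap_sum = diff if gap_sum is None else gap_sum + diff
            n += 1
    gap = gap_sum / n
    k = max(1, int(ratio * gap.numel()))
    return gap.topk(k).indices                       # indices of biased neurons

def predict_with_dropout(deployed_model, x, biased_idx):
    """Inference with the identified biased neurons' outputs set to zero."""
    with torch.no_grad():
        z = deployed_model.features(x).flatten(1)
        z[:, biased_idx] = 0.0                       # drop biased intermediate outputs
        return deployed_model.classifier(z)          # assumes classifier takes flat features
```

In practice the generator and discriminator would be trained over the full training set for many epochs, and the FDO drop ratio would be tuned on held-out data; both are treated here as free hyperparameters.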
Keywords/Search Tags: Deep learning model, Fairness, Adversarial examples