
Research On Fairness Testing Of Neural Networks Through Gradient Search

Posted on: 2022-12-31
Degree: Master
Type: Thesis
Country: China
Candidate: L F Zhang
Full Text: PDF
GTID: 2518306776492484
Subject: Trade Economy
Abstract/Summary:
Once properly trained and fine-tuned, deep learning systems (e.g., deep neural networks) are capable of autonomous decision-making for their prediction tasks, which has enabled a wide range of industrial applications of neural networks. Before a deep learning system is deployed, its desired properties should be thoroughly tested or even verified, and certain guarantees should be given that these properties are satisfied. Apart from robustness and safety, fairness is an important property that a well-designed deep learning system should satisfy. To avoid bias issues, it is crucial to evaluate and improve the individual fairness of neural networks and to systematically generate test cases that violate individual fairness. To discover and mitigate the discrimination exposed by neural networks, EIDIG (Efficient Individual Discrimination Instance Generator) is proposed for fairness testing of neural networks, building on the property that common neural networks are differentiable, or differentiable almost everywhere. EIDIG adopts the gradient of the model output with respect to the input, instead of the gradient of the loss function with respect to the input, as the guidance of a two-phase search, which significantly lowers computational costs. In the global search phase, EIDIG uses clustering algorithms to rapidly generate a small, diverse set of individual discriminatory instances as seed inputs for the next phase. In the local search phase, EIDIG aims to identify as many individual discriminatory instances as possible in the vicinity of these discriminatory seeds. Finally, a proportion of the individual discriminatory instances generated by EIDIG are sampled for retraining, which effectively mitigates the bias of the original models. In each phase, prior information from successive iterations is fully exploited to optimize the whole search process.

During global search, potential individual discrimination instance pairs are iteratively perturbed towards the decision boundary of the subject model under gradient guidance, until the paired examples, which differ only in some sensitive attributes, are predicted as different classes. A momentum term that incorporates the gradient information of several previous iterations is integrated into each global search step; it stabilizes the search direction and accelerates the convergence of the global search phase. During local search, the discriminatory seeds produced by global search are minimally perturbed, according to attribute contributions evaluated from gradient information, so as to keep the original predictions. Experiments show that the gradient information at successive iterations of the local search phase is highly correlated; consequently, the update frequencies of the gradient and the attribute contributions can be reduced to significantly lower computational costs while maintaining search effectiveness. The experimental results show that, on average, EIDIG generates 19.11% more individual discriminatory instances with a speedup of 121.49% compared with the state-of-the-art method, and mitigates individual discrimination by 80.03% with limited accuracy loss after retraining. EIDIG thus achieves state-of-the-art performance in fairness testing of neural networks in terms of generation quantity, generation speed, and fairness improvement.

Additionally, approaches for widening the applicability of EIDIG are presented. For black-box scenarios, where the internal structure and weights of the model are inaccessible, zeroth-order optimization techniques are used to estimate gradients, which turns EIDIG into a black-box testing method. For unstructured data, three approaches are proposed to flip the sensitive attributes of images or texts: iterative adversarial attacks, generation with generative adversarial networks, and word-analogy mutation.
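To illustrate what the search targets, the check for an individual discrimination instance pair can be sketched as follows. The `predict` function, its weights, and the attribute layout here are hypothetical stand-ins for a trained classifier, not the models studied in the thesis.

```python
# Hypothetical linear "model" standing in for a trained binary classifier.
def predict(x, weights=(0.5, -1.2, -0.8), bias=0.1):
    score = sum(w * v for w, v in zip(weights, x)) + bias
    return 1 if score > 0 else 0

def is_discriminatory(x, x_prime, sensitive_idx, predict_fn=predict):
    """A pair forms an individual discrimination instance when the two
    inputs agree on every non-sensitive attribute but the model assigns
    them different classes."""
    for i, (a, b) in enumerate(zip(x, x_prime)):
        if i not in sensitive_idx and a != b:
            return False  # the pair must differ only in sensitive attributes
    return predict_fn(x) != predict_fn(x_prime)
```

A pair that differs only at the sensitive index and receives different predictions counts as discriminatory; a pair that also differs in a non-sensitive attribute does not qualify, regardless of the predictions.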
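The momentum-accumulated perturbation direction of the global search phase might be sketched like this. The decay factor `mu` and the sign-based step are assumptions in the spirit of momentum-based gradient methods, not the thesis's exact update rule.

```python
def momentum_step(grad, velocity, mu=0.9):
    """Accumulate gradients across iterations (momentum) and take the sign
    of the accumulated velocity as the perturbation direction that pushes
    the instance pair toward the decision boundary."""
    velocity = [mu * v + g for v, g in zip(velocity, grad)]
    direction = [1 if v > 0 else (-1 if v < 0 else 0) for v in velocity]
    return velocity, direction
```

Because the velocity blends in gradients from previous iterations, a single noisy gradient is less likely to flip the search direction, which is the stabilizing effect described above.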
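For the black-box extension, a zeroth-order gradient estimate via central finite differences can be sketched as follows; only queries to the model output are needed, never its weights. The step size `h` is an assumed hyperparameter.

```python
def zeroth_order_grad(f, x, h=1e-4):
    """Central finite-difference estimate of the gradient of a black-box
    scalar-output function f at point x."""
    grad = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        grad.append((f(xp) - f(xm)) / (2 * h))
    return grad
```

Each coordinate costs two model queries, so the estimate scales linearly with the input dimension; this is the usual price of replacing analytic gradients in a black-box setting.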
Keywords/Search Tags: algorithmic bias, fairness testing, neural networks, test case generation, individual fairness, individual discrimination instance