| With the rapid development of the big data era and the spread of smart terminals, voiceprint features are widely used in authentication tasks because they are easy to collect, highly robust, and cannot be forgotten. However, as a biometric feature, a voiceprint is unique and difficult to revoke once stolen. Among recent privacy protection efforts, the Voice Privacy Challenge (VPC) proposes a neural-network-based synthesis framework and evaluation system for anonymous speech, which has attracted growing attention from researchers. However, because the VPC 2022 baseline system aims to maximize the difference from the original voiceprint, its anonymous voiceprints are similar to one another and concentrated in distribution. In practical scenarios with multiple service providers, this makes it easier for a malicious server to use the anonymous voiceprints registered with it to launch impersonation authentication against neighboring servers. Moreover, stronger anonymization comes with reduced auditory recognizability relative to the original voiceprint and a higher word error rate. Therefore, starting from the problem of distribution concentration, and balancing the privacy preservation and multi-faceted utility of anonymous voiceprints, the main work of this thesis is as follows: (1) A StarGAN-based algorithm for anonymous voiceprint generation is proposed. Generation domains are defined by cosine clustering on the anonymity pool, so that voiceprint similarity varies little within a domain and greatly across domains. The model structure and layer-wise parameters of StarGAN are redesigned for the one-dimensional, fixed-length, ordered features of the generation task. By using StarGAN to map the original voiceprint into different generation domains, anonymous voiceprints with varying degrees of similarity to the original voiceprint can be obtained. The
experiments show that the share of generated voiceprints in the high-similarity interval [0.9, 1] decreases by 22.75% from 100%, and that the average similarity between generated voiceprints and the centroid of the target domain is 32.25% higher than that for non-target domains. (2) An exponential-mechanism-based privacy-preserving method for voiceprint generation is proposed. Building on the anonymous voiceprint generation algorithm, we propose a personalized privacy protection method that assigns generation domains with different degrees of privacy protection according to the confidence level and, to address the risk of collusion inference attacks, uses the differential-privacy exponential mechanism to perturb the correlation between confidence level and degree of privacy protection. Through the exponential mechanism, personalized privacy protection is achieved while collusion inference attacks are blocked. The experiments show that, on the one hand, as the privacy budget increases, the proportion of generated voiceprints in the high-similarity interval [0.9, 1] increases gradually; on the other hand, compared with the low privacy budget (ε = 0.01), the average EER and CLLR under the high privacy budget (ε = 20) decrease by 64.31% and 90.89%, respectively, the average linkability of feature templates increases by 53.31%, and the personalized privacy protection effect is significant. |
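
The domain-selection step in (2) can be illustrated with a minimal sketch of the standard differential-privacy exponential mechanism. This is not the thesis's implementation. The function names, the utility function (negative distance between a candidate domain index and the domain preferred for a given confidence level), and the sensitivity bound are all assumptions made for illustration only.

```python
import math
import random


def exponential_mechanism_probs(utilities, epsilon, sensitivity=1.0):
    """Selection probabilities proportional to exp(epsilon * u / (2 * sensitivity))."""
    if epsilon <= 0 or sensitivity <= 0:
        raise ValueError("epsilon and sensitivity must be positive")
    scale = epsilon / (2.0 * sensitivity)
    # Subtract the maximum utility before exponentiating for numerical stability.
    u_max = max(utilities)
    weights = [math.exp(scale * (u - u_max)) for u in utilities]
    total = sum(weights)
    return [w / total for w in weights]


def domain_utilities(preferred_domain, num_domains):
    """Hypothetical utility: negative index distance from the preferred domain."""
    return [-abs(d - preferred_domain) for d in range(num_domains)]


def select_domain(preferred_domain, num_domains, epsilon, rng=None):
    """Randomly pick a generation domain, favoring the preferred one as epsilon grows."""
    rng = rng or random.Random()
    utilities = domain_utilities(preferred_domain, num_domains)
    # Assumption: changing the preferred domain shifts any utility by at most
    # num_domains - 1, so that value is used as the sensitivity bound.
    sensitivity = max(1, num_domains - 1)
    probs = exponential_mechanism_probs(utilities, epsilon, sensitivity)
    return rng.choices(range(num_domains), weights=probs, k=1)[0]


if __name__ == "__main__":
    # Five hypothetical generation domains; domain 2 is preferred for this user.
    for eps in (0.01, 20):
        probs = exponential_mechanism_probs(domain_utilities(2, 5), eps, sensitivity=4)
        print(eps, [round(p, 3) for p in probs])
```

Under these assumptions, a small privacy budget makes the choice of domain close to uniform, which hides the link between confidence level and protection degree. A large budget concentrates the probability on the preferred domain, which favors personalization over privacy.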