| With the development of deep learning, its techniques have been widely adopted in speech recognition systems. Because deep neural networks are vulnerable to adversarial sample attacks, attacks on and defenses for speech recognition systems have become a research focus in recent years. At present, most work on adversarial sample generation and defense targets English speech recognition models. Existing methods that use the firefly algorithm and the genetic algorithm to generate adversarial samples for Chinese speech recognition models still suffer from problems such as slow convergence. Based on the current state of research, this paper uses swarm intelligence optimization algorithms to carry out black-box untargeted attacks on a Chinese speech recognition model and improves the objective function. While maintaining high audio similarity and a high signal-to-noise ratio, the time needed to generate adversarial samples is substantially reduced. The research work and contributions of this paper are as follows: 1. Differential evolution and simulated annealing are used to reduce the time needed to generate adversarial samples. Because the firefly and genetic algorithms easily become trapped in local optima, this paper exploits simulated annealing's acceptance of worse solutions and differential evolution's differential mutation, both of which help the search escape local optima and thus shorten generation time. 2. The sum of the CTC loss and the edit distance is used as the objective function to reduce the time needed to generate adversarial samples. The CTC loss alone does not intuitively represent the distance between recognized transcriptions. To address this, the sum of the CTC loss and the edit distance is taken as the objective function, so that samples close to the target can be selected quickly, reducing the time needed to generate audio adversarial samples. 3. An adversarial sample generation platform is designed and implemented. We build a Deep Speech 2 speech recognition model and implement the speech recognition function. Three methods of generating adversarial samples are integrated into one platform, and the generation process is visualized. |
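
The search components named in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function names, the scale factor `f`, the temperature handling, and the choice to minimize the score are all assumptions, and the CTC loss is passed in as a precomputed number because the abstract does not say how it is obtained from the model.

```python
import math
import random


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two transcriptions (row-by-row DP)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def objective(ctc_loss: float, transcript: str, reference: str) -> float:
    """Hypothetical combined score: CTC loss plus edit distance.

    Whether this is minimized or maximized depends on the attack setting;
    the sketch treats it as a quantity to minimize (an assumption).
    """
    return ctc_loss + edit_distance(transcript, reference)


def de_mutant(base, x1, x2, f=0.5):
    """Differential mutation: base + f * (x1 - x2), element-wise."""
    return [b + f * (p - q) for b, p, q in zip(base, x1, x2)]


def accept(old_score, new_score, temperature, rng):
    """Metropolis rule from simulated annealing (minimization).

    Improvements are always accepted; a worse candidate is accepted with
    probability exp(-(new - old) / T), which lets the search leave a
    local optimum.
    """
    if new_score <= old_score:
        return True
    return rng.random() < math.exp(-(new_score - old_score) / temperature)
```

In a loop built on these pieces, each candidate perturbation produced by `de_mutant` would be scored with `objective` and kept or discarded by `accept`, with `temperature` lowered over the iterations so that worse candidates are accepted less often as the search proceeds.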