
Design And Research Of Audio CAPTCHA Based On Adversarial Attack

Posted on: 2022-03-04
Degree: Master
Type: Thesis
Country: China
Candidate: Z N Yuan
GTID: 2518306602992889
Subject: Computer Science and Technology
Abstract/Summary:
A CAPTCHA (Completely Automated Public Turing test to tell Computers and Humans Apart) is an automated test used to distinguish human users from computer programs. Over the years, CAPTCHAs, as the main defense mechanism against malicious programs attacking public systems, have received extensive attention and research, and they take many forms. Audio CAPTCHA, one of the most important forms, plays a key role because it provides an effective means of verification for visually impaired users. In recent years, however, with the rapid development of machine learning, most existing audio CAPTCHAs can be successfully attacked by machine learning-based speech recognition algorithms, which makes them a security risk. At the same time, the expanding application of deep learning has popularized voice applications, so audio CAPTCHAs are no longer a verification mechanism limited to special groups. For these two reasons, the main goal of this thesis is to design a more secure and usable adversarial audio CAPTCHA. The research work is divided into the following two parts:

(1) An algorithm based on a generative adversarial network is proposed to generate adversarial audio CAPTCHAs for different types of audio datasets, so that they resist attack by the target speech recognition model. The algorithm uses a generator to learn a synthetic perturbation and then combines that perturbation with clean audio to form the raw adversarial audio. A discriminator is used to identify whether an audio CAPTCHA is a clean sample or an adversarial sample. In addition, two mainstream deep learning-based speech recognition models, Deep Speech and Lingvo, are integrated as targets, so that the adversarial audio not only misleads their transcriptions but also gains improved transferability. Hinge loss and regularization techniques are used to improve the perceptual quality of the audio CAPTCHAs. Because the algorithm is built on a generative adversarial network, it is suitable for batch generation of adversarial samples and for application scenarios that require real-time generation.

(2) The effectiveness, transferability, and universality of the algorithm are verified. First, the algorithm is used to generate adversarial samples for a representative audio CAPTCHA, reCAPTCHA V2. The experimental results show that the recognition accuracy of the target speech recognition model on the generated adversarial samples is reduced to 0%. Moreover, for the two target speech recognition models, the SNR between the original samples and the corresponding adversarial samples is 15.54 dB and 18.11 dB respectively, and the STOI score is 0.92 and 0.95 respectively. A comparison with previous strong research work shows that the adversarial audio CAPTCHAs generated by this algorithm resist the target speech recognition model while keeping acceptable perceptual audio quality. Second, the effectiveness of the algorithm is tested on a digital audio CAPTCHA, which shows that the algorithm can be applied to different types of audio CAPTCHAs. Third, the two target recognition models, Deep Speech and Lingvo, are integrated and trained together to show that the integration strategy improves the transferability of the adversarial samples generated by the algorithm. Finally, the proposed algorithm is applied to the LibriSpeech audio dataset, which has sufficient data volume, to demonstrate that it can learn a general generation model.
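The SNR figures and the hinge loss mentioned in the abstract can be illustrated with a small example. The following is only a minimal sketch under stated assumptions: it treats the perturbation (adversarial minus clean) as the noise when computing SNR, and it assumes a soft hinge on the L2 norm of the perturbation, since the abstract does not give the exact loss used. The names `snr_db`, `hinge_perturbation_loss`, and the `bound` parameter are hypothetical and do not come from the thesis.

```python
import math
from typing import Sequence


def snr_db(clean: Sequence[float], adversarial: Sequence[float]) -> float:
    """Signal-to-noise ratio in dB, treating (adversarial - clean) as noise."""
    if len(clean) != len(adversarial):
        raise ValueError("signals must have the same length")
    signal_power = sum(c * c for c in clean)
    noise_power = sum((a - c) ** 2 for c, a in zip(clean, adversarial))
    if noise_power == 0.0:
        return math.inf  # identical signals: no perturbation at all
    if signal_power == 0.0:
        return -math.inf  # silent clean signal with a non-zero perturbation
    return 10.0 * math.log10(signal_power / noise_power)


def hinge_perturbation_loss(perturbation: Sequence[float], bound: float) -> float:
    """Assumed soft hinge: zero while the L2 norm stays within `bound`,
    growing linearly once the perturbation exceeds it."""
    l2_norm = math.sqrt(sum(p * p for p in perturbation))
    return max(0.0, l2_norm - bound)


if __name__ == "__main__":
    clean = [1.0, -1.0, 1.0, -1.0]
    adversarial = [1.1, -0.9, 1.1, -0.9]  # clean audio plus a small perturbation
    print(f"SNR: {snr_db(clean, adversarial):.2f} dB")
    print("hinge:", hinge_perturbation_loss([3.0, 4.0], bound=4.0))
```

A higher SNR means the perturbation is quieter relative to the clean audio. A hinge of this kind would penalize the generator only when the perturbation grows beyond the chosen bound, which is one common way to trade attack strength against perceptual quality.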
Keywords/Search Tags: audio CAPTCHA, reCAPTCHA, adversarial examples, generative adversarial networks, speech recognition model