Research On Speech Spoofing Detection Based On Attention Mechanism And End-to-End Model

Posted on:2022-10-29

Degree:Master

Type:Thesis

Country:China

Candidate:L C Huang

GTID:2518306572991469

Subject:Computer application technology

Abstract/Summary:

With the rapid development of deep learning, speech generation technologies such as speech synthesis and voice conversion have matured and can now generate very natural, fluent, and realistic speech. Although deep speech generation technology brings diverse forms of entertainment to people's lives, it also poses a serious security threat to automatic speaker verification (ASV) systems. Academia has therefore begun to design dedicated ASV spoofing detection systems. However, the neural networks adopted by recent speech spoofing detection algorithms are all classic structures optimized for images, and their performance is not ideal. To detect artifacts in the speech signal effectively, this thesis optimizes and improves the network structure according to the characteristics of spoofed speech signals. The main research contents are as follows:

1. Because traditional convolutional networks are poorly suited to the ASV spoofing detection task and perform poorly on it, this thesis optimizes the network structure in three respects: frequency, channel, and time. Traditional convolutional networks cannot capture the inter-harmonic correlations across frequencies in fake speech; as the number of layers increases, the number of channels becomes too large and introduces information redundancy; and the final global average pooling easily loses useful information. In response to these problems, this thesis proposes a frequency attention module, a channel attention module, and a temporal self-attention layer that refine the extracted acoustic features and make them more discriminative.

2. Because traditional hand-crafted acoustic feature extraction loses information, and a single feature cannot detect multiple spoofing attack algorithms simultaneously, this thesis attempts to solve the spoofing detection problem in an end-to-end manner. By analyzing how the Fourier transform is computed in speech signal processing, a method of time-frequency conversion using time-domain convolution is proposed, and its feasibility is verified experimentally. To address the excessive number of time-domain convolution parameters and the inability to learn effective filters, the sinc function from SincNet is introduced to construct a band-pass filter convolution with only two learnable parameters, which reduces the parameter count and improves performance. In addition, inspired by RawNet2, this thesis proposes replacing the 2D convolutional residual blocks with 1D time-domain convolutional residual blocks and using a gated recurrent unit (GRU) network to model the frame-level features.

3. Extensive comparative experiments on the ASVspoof 2019 LA dataset demonstrate the effectiveness and feasibility of the proposed methods. The best attention-based algorithm in this thesis achieves an equal error rate (EER) of 1.87%, which surpasses all known single-system models.
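
The sinc-based band-pass convolution mentioned in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation, which is not shown here; it only follows the general SincNet idea that a whole filter kernel is determined by two scalars, here a low and a high cutoff frequency. The function names, the 16 kHz sample rate, the 129-tap kernel size, the Hamming window, and the cutoff values are all assumptions chosen for illustration, and the cutoffs are fixed rather than learned by gradient descent.

```python
import math

def sinc(x):
    """Unnormalized sinc, sin(x)/x, with the x = 0 limit handled."""
    return 1.0 if x == 0.0 else math.sin(x) / x

def sinc_bandpass_kernel(f_low, f_high, kernel_size, sample_rate):
    """Band-pass FIR kernel determined only by two cutoffs (in Hz).

    Built as the difference of two ideal low-pass filters and then
    Hamming-windowed. All parameter names here are hypothetical.
    """
    if kernel_size % 2 == 0:
        raise ValueError("kernel_size must be odd so the kernel is symmetric")
    f1 = f_low / sample_rate    # normalized cutoffs, in cycles per sample
    f2 = f_high / sample_rate
    half = kernel_size // 2
    kernel = []
    for i in range(kernel_size):
        n = i - half
        ideal = (2 * f2 * sinc(2 * math.pi * f2 * n)
                 - 2 * f1 * sinc(2 * math.pi * f1 * n))
        window = 0.54 - 0.46 * math.cos(2 * math.pi * i / (kernel_size - 1))
        kernel.append(ideal * window)
    return kernel

def conv1d_valid(signal, kernel):
    """'Valid' 1-D correlation; equals convolution for a symmetric kernel."""
    k = len(kernel)
    return [sum(signal[t + j] * kernel[j] for j in range(k))
            for t in range(len(signal) - k + 1)]

def rms(values):
    """Root-mean-square amplitude of a sequence."""
    return math.sqrt(sum(v * v for v in values) / len(values))

sample_rate = 16000  # assumed sample rate in Hz
kernel = sinc_bandpass_kernel(f_low=1000.0, f_high=2000.0,
                              kernel_size=129, sample_rate=sample_rate)

def tone(freq, n=1024):
    """Unit-amplitude sine wave at `freq` Hz."""
    return [math.sin(2 * math.pi * freq * t / sample_rate) for t in range(n)]

# A tone inside the pass band should survive; one far outside should not.
in_band = rms(conv1d_valid(tone(1500.0), kernel))
out_band = rms(conv1d_valid(tone(5000.0), kernel))
print(f"in-band RMS: {in_band:.4f}, out-of-band RMS: {out_band:.4f}")
```

In a trainable layer, the two cutoffs would be parameters updated by backpropagation while the kernel is recomputed from them on each forward pass, so the parameter count per filter stays at two regardless of kernel length.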

Keywords/Search Tags:

Deep learning, Automatic speaker recognition, ASV spoofing detection, Attention mechanism, End-to-end model

Related items

1	Research On Deep Learning Based Speaker Recognition Algorithm
2	Speaker Recognition Method Based On Deep Learning
3	Text Independent Speaker Recognition Based On Deep Learning Framework
4	Deep Learning As A Speaker Spoofing Countermeasure
5	Spoofing Speech Detection Research
6	Design And Implementation Of Face Recognition System Based On Deep Learning
7	Research On Face Anti-Spoofing Algorithm Based On Deep Learning
8	Research On Voice Transformation Spoofing Detection Algorithm And Implementation Of Robust ASR System
9	Research On Text-independent Speaker Recognition Based On Attention Mechanism
10	Speaker Verification And Anti-Spoofing Attacks Technology Based On Deep Learning