Research And Implementation Of High Recognition Rate Voiceprint Recognition Technology Based On Convolutional Neural Network

Posted on:2022-12-25

Degree:Master

Type:Thesis

Country:China

Candidate:Q Ma

GTID:2518306764477474

Subject:Automation Technology

Abstract/Summary:

Every voiceprint is unique, and voiceprint recognition is an active, cutting-edge research topic in biometric authentication. This thesis studies text-independent voiceprint recognition: it uses a residual convolutional neural network to extract voiceprint features and adds attention mechanisms to improve recognition performance. It also studies how different loss functions affect voiceprint recognition and proposes an AP loss function suited to this task. Finally, it adopts data augmentation strategies to further reduce the equal error rate (EER) of the voiceprint recognition system. The main work and contributions of the thesis are summarized as follows:

(1) The voiceprint feature extraction module is designed and improved on the basis of the original ResNet-34 model. The Convolutional Block Attention Module (CBAM) attention mechanism and a self-attention pooling (SAP) encoding layer are introduced into the model, yielding the proposed FRACNN2D neural network model. Compared with the x-vector model on the VoxCeleb1 dataset, this model reduces the EER by 2.01% and the MinDCF by 0.33, and its ROC-AUC is 0.85% lower. Compared with the ResNet-34 model trained on the VoxCeleb2 dataset, the EER is reduced by 1.72%.

(2) The influence of different loss functions on voiceprint recognition is studied, including the traditional Softmax loss function and its improved variants, the AM-Softmax and ArcFace loss functions, under different margin coefficients m and scale hyperparameters s. The GE2E and Prototypical loss functions, which are based on deep metric learning, are also analyzed and tested. The original Prototypical loss function is then improved into the proposed AP loss function, which uses the cosine metric to introduce scale invariance and rotation invariance. Finally, the EER is used to compare the performance of the different loss functions; the best-performing AP loss function lowers the EER on the VoxCeleb1 dataset by up to 4.57%.

(3) Two types of data augmentation strategies are applied to the FRACNN2D-AP neural network model proposed in this thesis. The first adds additive noise, including Music, Bubble, RIR, etc.; the second uses the SpecAugment strategy, comprising three operations: Time Warp, LB, and LD. The two types of strategies reduce the EER by 0.11% and 0.19%, respectively.
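Since every result above is reported as an EER, a minimal sketch of how that metric is commonly computed from verification trial scores may help. It is not taken from the thesis: the function name `equal_error_rate`, the threshold sweep, and the toy scores are illustrative assumptions. The sketch assumes that a higher score means "more likely the same speaker".

```python
def equal_error_rate(target_scores, nontarget_scores):
    """Return (eer, threshold) for a set of speaker-verification scores.

    target_scores:    scores of same-speaker trials
    nontarget_scores: scores of different-speaker trials
    Hypothetical helper for illustration; not the thesis's evaluation code.
    """
    if not target_scores or not nontarget_scores:
        raise ValueError("need at least one target and one non-target score")

    best = None
    # Sweep every observed score as a candidate decision threshold.
    for thr in sorted(set(target_scores) | set(nontarget_scores)):
        # False acceptance rate: impostor trials accepted at this threshold.
        far = sum(s >= thr for s in nontarget_scores) / len(nontarget_scores)
        # False rejection rate: genuine trials rejected at this threshold.
        frr = sum(s < thr for s in target_scores) / len(target_scores)
        gap = abs(far - frr)
        # The EER is read off where the two error rates are closest.
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2.0, thr)
    return best[1], best[2]


# Toy scores (made up for the example, not data from the thesis).
eer, thr = equal_error_rate([0.9, 0.8, 0.7, 0.4], [0.6, 0.3, 0.2, 0.1])
print(eer, thr)  # eer = 0.25 at threshold 0.6
```

With a finite set of trials the two error curves rarely cross exactly, so this sketch takes the threshold where they are closest and averages the two rates; evaluation toolkits often interpolate along the ROC curve instead.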

Keywords/Search Tags:

voiceprint recognition, convolutional neural network, attention mechanism, FRACNN2D, data augmentation

Related items

1	Research On The Method Of Voiceprint Recognition Based On Deep Neural Network
2	Research On Action Data Augmentation Strategy And Action Recognition Model Based On Skeleton Motion Sequence
3	Chinese Speech Recognition Based On Deep Convolutional Neural Network
4	Research On Convolutional Neural Network Based On Variable-Length Speech Data Voiceprint Recognition Technology
5	Research And Application On Attention-based Facial Landmark Detection Technology
6	The Application Of Convolutional Neural Networks In Voiceprint Recognition
7	Research On Voiceprint Recognition Model Based On Deep Learning
8	Environmental Sound Recognition Based On Deep Learning
9	Research On Classification Method Of Hyperspectral Image Based On Improved 3D Convolutional Neural Network
10	Research On Key Technologies Of Voiceprint Recognition Based On Deep Learning