
Research On Key Technologies Of Voiceprint Recognition Based On Deep Learning

Posted on: 2022-01-16
Degree: Master
Type: Thesis
Country: China
Candidate: Y Liu
GTID: 2518306548499784
Subject: Computer technology
Abstract/Summary:
With the advancement of society and the rapid development of Internet technology, the application scenarios of identity authentication have become increasingly complex. Traditional identity authentication technology can no longer meet practical needs, and there is an urgent demand for authentication that is both secure and convenient. Voiceprint recognition is a biometric technology. Compared with traditional identity authentication, it offers higher security and convenience. Compared with other biometric technologies such as face recognition and fingerprint recognition, it has a lower application cost, intrudes less on privacy, and is more readily accepted by users, which makes it a promising identity authentication technology. Despite these advantages, traditional voiceprint recognition suffers from a complex implementation process and low recognition accuracy. In recent years, with the rapid development of artificial intelligence, deep learning methods have been gradually replacing traditional statistical learning methods in voiceprint recognition thanks to their superior performance, and they have become a research hotspot in the field.

This thesis studies voiceprint recognition based on deep learning. It introduces spatio-temporal fusion feature extraction and a channel attention mechanism into voiceprint recognition and proposes three methods: a voiceprint recognition method based on ResNet-GRU, one based on ECA-DenseNet, and one based on ECA-DenseNet-GRU. The innovations of this thesis are as follows:

(1) This thesis proposes a voiceprint recognition method based on ResNet-GRU. A voiceprint feature is essentially time series data, yet some methods use only a CNN to extract its spatial features, which leaves its temporal structure unexploited. Conversely, when an RNN is applied directly to long time series data, training usually converges slowly because of the high complexity of the model. This thesis therefore combines the advantages of CNNs and RNNs. First, a residual network (ResNet) extracts high-level features from the voiceprint feature, reducing the size of the feature map while capturing spatial features. Then a gated recurrent unit (GRU) network extracts time series features from the feature map. The experimental results show that the recognition ability of the proposed ResNet-GRU method is significantly better than that of the baseline method, and that it also improves considerably on voiceprint recognition methods that extract only spatial or only temporal features.

(2) This thesis proposes a voiceprint recognition method based on ECA-DenseNet. When a CNN is used to extract spatial features from voiceprint features, it generally produces a feature map with many channels, and each channel contributes differently to the recognition process. This thesis therefore uses the ECA-Net channel attention mechanism to reassign the weight of each channel of the feature map. In addition, because the dense convolutional network (DenseNet) connects its layers more closely than ResNet does, encourages feature reuse, and reduces the number of model parameters, this thesis adopts DenseNet as the spatial feature extraction network for voiceprint features. The experimental results show that the DenseNet-based method achieves higher recognition performance than the ResNet-based method while taking up less disk space, and that the recognition ability of the proposed ECA-DenseNet method is also significantly better than that of the baseline method.

(3) This thesis proposes a voiceprint recognition method based on ECA-DenseNet-GRU. The experimental results show that the proposed ECA-DenseNet-GRU method is far superior to the baseline method and also improves considerably on both the ResNet-GRU method and the ECA-DenseNet method. This thesis then further improves the training of the ECA-DenseNet-GRU method by using the additive angular margin loss (ArcFace) as the cost function for network training, which further improves recognition performance.

Compared with the baseline method, all three voiceprint recognition methods proposed in this thesis improve recognition performance. They provide new ideas and solutions for short-utterance voiceprint recognition and have good academic research value.
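The channel reweighting described in innovation (2) can be illustrated with a small NumPy sketch. The abstract gives no implementation details, so everything here is an assumption made for illustration: the function name `eca_channel_attention`, the `(channels, height, width)` layout, the kernel size of 3, and the fixed uniform kernel weights, which stand in for weights that would be learned during training. The sketch follows the general ECA idea only: average each channel to a single value, mix neighbouring channel values with a small 1D convolution, pass the result through a sigmoid, and rescale each channel by its gate.

```python
import numpy as np


def eca_channel_attention(feature_map, kernel_size=3, weights=None):
    """Illustrative ECA-style channel attention (hypothetical helper).

    feature_map: array of shape (channels, height, width).
    kernel_size: odd size of the 1D kernel applied across the channel axis.
    weights: kernel weights; in a trained network these would be learned.
             A uniform kernel is used here purely as a placeholder.
    Returns the reweighted feature map and the per-channel gates.
    """
    if kernel_size % 2 == 0:
        raise ValueError("kernel_size must be odd")
    if weights is None:
        weights = np.full(kernel_size, 1.0 / kernel_size)

    channels = feature_map.shape[0]

    # Squeeze: global average pooling gives one descriptor per channel.
    pooled = feature_map.mean(axis=(1, 2))

    # Local cross-channel interaction: zero-padded 1D convolution over channels.
    pad = kernel_size // 2
    padded = np.pad(pooled, pad)
    mixed = np.array(
        [np.dot(padded[i:i + kernel_size], weights) for i in range(channels)]
    )

    # Sigmoid turns the mixed descriptors into gates in (0, 1).
    gate = 1.0 / (1.0 + np.exp(-mixed))

    # Rescale every channel of the feature map by its gate.
    return feature_map * gate[:, None, None], gate


# Usage: reweight a random 8-channel feature map.
rng = np.random.default_rng(0)
features = rng.normal(size=(8, 16, 16))
reweighted, gates = eca_channel_attention(features)
```

Because the gate depends only on a handful of neighbouring channels, this kind of attention adds very few parameters, which fits the abstract's emphasis on keeping the model small.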
Keywords/Search Tags: voiceprint recognition, spatiotemporal fusion, GRU, DenseNet, attention mechanism