Research And Application Of Voiceprint Recognition Based On Deep Learning

Posted on: 2024-02-29
Degree: Master
Type: Thesis
Country: China
Candidate: F F Deng
GTID: 2568307100995409
Subject: Master of Electronic Information (Professional Degree)
Abstract/Summary:
In recent years, the growing demand for accurate identity recognition and verification has opened new research directions and posed new challenges for enterprises and academic researchers in related fields. Traditional identification methods suffer from low security, credentials that are hard to remember, and vulnerability to attack, all of which give users a poor experience. Compared with earlier biometric techniques such as face recognition and fingerprint recognition, voiceprints have unique advantages, so voiceprint recognition has strong prospects in current identity verification scenarios. However, its accuracy in practical application scenarios still needs improvement, and its results are easily affected by the environment and noise, so the technology has not yet been widely adopted. With the further development of deep learning, the performance and accuracy of voiceprint recognition systems have improved to some extent, and work on voiceprint recognition mainly focuses on feature extraction, model structure, and prediction evaluation. The research content and innovations of this thesis are as follows:

(1) The feature extraction network is the focus of voiceprint recognition research, and attention mechanisms are widely used in current deep learning research. This thesis studies the feature extraction network for voiceprints and improves the ResNet50 model by adding CBAM (Convolutional Block Attention Module), a convolution-based attention mechanism, to the structure of the residual block. CBAM uses both average pooling and max pooling, which reduces the information lost in pooling to some extent. With the CBAM block added, the feature maps of the ResNet50 model receive attention weights along both the channel and spatial dimensions, which strengthens the correlation of acoustic features across channels and spatial positions and helps extract effective features, which in turn improves the performance of the feature extraction network. In experimental comparisons with other models, the improved residual convolutional network achieves higher recognition accuracy.

(2) For model training optimization, the ArcFace loss function is used for the classifier. ArcFace maps the feature vectors of the time-frequency images onto a hypersphere so that different sound signals are separated more clearly, and this loss function classifies better when the dataset contains more than three thousand classes.

(3) Building on the model research above, this thesis applies voiceprint recognition to identity authentication and adds a new authentication method to the login of current B/S (browser/server) architecture systems or platforms: the user's voice is used to verify whether that user is authorized to use the system. This is convenient, improves the user experience, and also enhances the security of the system to some extent.
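The attention step described in (1) can be illustrated with a small, dependency-free sketch. It follows the general CBAM design (channel attention from average- and max-pooled descriptors passed through a shared MLP, then spatial attention from channel-wise average and max maps passed through a convolution). The function names, the nested-list layout (channels x height x width), and all weight shapes are assumptions made for illustration; the abstract does not give the thesis's actual layer sizes or where in the residual block the module is placed.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def shared_mlp(v, w1, w2):
    """Two-layer MLP with ReLU. w1 has shape (C/r) x C, w2 has shape C x (C/r)."""
    hidden = [max(0.0, sum(w * x for w, x in zip(row, v))) for row in w1]
    return [sum(w * h for w, h in zip(row, hidden)) for row in w2]

def channel_attention(fmap, w1, w2):
    """One weight per channel, built from average- and max-pooled descriptors."""
    avg = [sum(map(sum, ch)) / (len(ch) * len(ch[0])) for ch in fmap]
    mx = [max(map(max, ch)) for ch in fmap]
    a, b = shared_mlp(avg, w1, w2), shared_mlp(mx, w1, w2)
    return [sigmoid(x + y) for x, y in zip(a, b)]

def spatial_attention(fmap, kernel, bias=0.0):
    """One weight per position. kernel has shape 2 x k x k (k odd), zero padding."""
    C, H, W = len(fmap), len(fmap[0]), len(fmap[0][0])
    avg = [[sum(fmap[c][i][j] for c in range(C)) / C for j in range(W)] for i in range(H)]
    mx = [[max(fmap[c][i][j] for c in range(C)) for j in range(W)] for i in range(H)]
    k = len(kernel[0])
    pad = k // 2
    out = [[0.0] * W for _ in range(H)]
    for i in range(H):
        for j in range(W):
            acc = bias
            for p, plane in enumerate((avg, mx)):
                for di in range(k):
                    for dj in range(k):
                        y, x = i + di - pad, j + dj - pad
                        if 0 <= y < H and 0 <= x < W:
                            acc += kernel[p][di][dj] * plane[y][x]
            out[i][j] = sigmoid(acc)
    return out

def cbam(fmap, w1, w2, kernel):
    """Apply channel attention first, then spatial attention, to a C x H x W map."""
    ca = channel_attention(fmap, w1, w2)
    refined = [[[ca[c] * v for v in row] for row in ch] for c, ch in enumerate(fmap)]
    sa = spatial_attention(refined, kernel)
    return [[[sa[i][j] * v for j, v in enumerate(row)] for i, row in enumerate(ch)]
            for ch in refined]
```

Because both pooled descriptors feed the same MLP and both pooled maps feed the same convolution, the average and the maximum each contribute to the final weights, which is the sense in which less information is lost than with a single pooling operation. In a real network these operations would be tensor layers with learned weights.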
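The mapping onto a hypersphere described in (2) can likewise be sketched in a few lines. The version below normalizes the embedding and the class weight vectors, adds an angular margin to the angle of the target class, rescales, and applies softmax cross-entropy. The names `arcface_logits` and `cross_entropy` and the values `s=30.0` and `m=0.5` are illustrative assumptions, not settings reported in the thesis.

```python
import math

def l2_normalize(v):
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def arcface_logits(embedding, class_weights, target, s=30.0, m=0.5):
    """Scaled cosine logits with an additive angular margin on the target class.

    s (scale) and m (margin, in radians) are illustrative values.
    """
    e = l2_normalize(embedding)
    logits = []
    for j, w in enumerate(class_weights):
        w = l2_normalize(w)
        # On the unit hypersphere the dot product is the cosine of the angle.
        cos_theta = max(-1.0, min(1.0, sum(a * b for a, b in zip(e, w))))
        if j == target:
            # Simplification: no special handling when theta + m exceeds pi.
            cos_theta = math.cos(math.acos(cos_theta) + m)
        logits.append(s * cos_theta)
    return logits

def cross_entropy(logits, target):
    """Numerically stable softmax cross-entropy for one sample."""
    mx = max(logits)
    log_sum = mx + math.log(sum(math.exp(z - mx) for z in logits))
    return log_sum - logits[target]

# Usage: an embedding equally close to two class centres.
emb = [1.0, 1.0]
centres = [[1.0, 0.0], [0.0, 1.0]]
loss_with_margin = cross_entropy(arcface_logits(emb, centres, target=0), 0)
loss_without_margin = cross_entropy(arcface_logits(emb, centres, target=0, m=0.0), 0)
```

In the usage lines the loss with the margin is larger than the loss without it, so training has to pull the embedding closer in angle to its own class centre, which is what separates the classes on the sphere.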
Keywords/Search Tags: Voiceprint Recognition, Residual Network, CBAM, ArcFace Loss Function