
Research on Speaker Recognition Based on DenseNet

Posted on: 2022-09-19
Degree: Master
Type: Thesis
Country: China
Candidate: M M Sun
Full Text: PDF
GTID: 2518306557469364
Subject: Electronics and Communications Engineering
Abstract/Summary:
With the improvement of computer performance and the rapid development of deep learning, speech technology has gradually become part of daily life. Speaker recognition, also known as voiceprint recognition, is one of the important branches of the speech recognition field. It is a form of biometric recognition that determines the identity of a speaker from their voice. Compared with other biometric technologies, speaker recognition has attracted growing attention because of its convenience, and it is widely used in various fields. This thesis optimizes a common text-independent speaker recognition system and improves the final recognition performance of the model. The main work of this thesis is as follows:

1. Optimizing the data preprocessing stage of speaker recognition. Building on conventional endpoint detection, a speaker recognition system based on DenseNet and between-class learning (BC-learning) is proposed. The approach learns a discriminative feature space by recognizing between-class utterances. A between-class utterance is generated by mixing two utterances that belong to different classes at a random ratio. The mixed sound is then fed to the model, which is trained to output the mixing ratio. The advantages of BC-learning are not limited to the increase in the amount of training data: BC-learning also enlarges the Fisher criterion in the feature space and regularizes the positional relationship among the feature distributions. Experimental results show that BC-learning effectively improves the accuracy of speaker recognition.

2. Optimizing the network model for speaker recognition. As a traditional convolutional neural network grows deeper, feature information is inevitably lost across the network layers during training. This thesis proposes a DenseNet-SEblock speaker recognition system, which alleviates the impact of this problem on recognition performance. The multi-scale idea is introduced into the DenseNet model, turning the original network into a multi-scale model. In addition, the SEblock module is introduced into the model. This module adaptively recalibrates the channel-wise feature responses by explicitly modeling the interdependencies between channels. Experiments show that, at the cost of a slight increase in computation, the DenseNet-SEblock model improves the recognition rate.
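The between-class mixing step described in item 1 can be illustrated with a minimal sketch. It assumes plain linear mixing of two equal-length waveforms and a ratio drawn uniformly from [0, 1); the abstract does not state the exact mixing formula or sampling scheme used in the thesis, and the function name `bc_mix` and its parameters are hypothetical.

```python
import random


def bc_mix(wave_a, label_a, wave_b, label_b, num_classes, rng=random):
    """Mix two utterances from different classes at a random ratio.

    Returns the mixed waveform, a soft target holding the mixing ratio
    for each of the two classes, and the ratio itself.
    Assumption: simple linear mixing, which may differ from the thesis.
    """
    if label_a == label_b:
        raise ValueError("between-class mixing needs two different classes")
    if len(wave_a) != len(wave_b):
        raise ValueError("waveforms must have the same length")

    r = rng.random()  # mixing ratio in [0, 1)
    mixed = [r * a + (1.0 - r) * b for a, b in zip(wave_a, wave_b)]

    # The model is trained to predict this ratio vector from the mixed sound.
    target = [0.0] * num_classes
    target[label_a] = r
    target[label_b] = 1.0 - r
    return mixed, target, r
```

Because the target is a ratio vector rather than a one-hot label, training under this sketch would pair it with a loss that accepts soft targets, for example KL divergence between the target and the softmax output.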
Keywords/Search Tags:Speaker Recognition, Deep learning, DenseNet, Squeeze and Excitation Networks, Between class learning, Multi scale
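The channel recalibration performed by the SEblock named in the abstract and keywords can be sketched as follows. This is a generic squeeze-and-excitation step written in plain Python for readability, not the thesis implementation: the two-layer bottleneck, the omission of biases, and the names `se_block`, `w1`, and `w2` are assumptions, and how the block is wired into the multi-scale DenseNet is not specified in the abstract.

```python
import math


def se_block(feature_maps, w1, w2):
    """Apply a squeeze-and-excitation step to a list of channels.

    feature_maps: list of C channels, each a flat list of activations.
    w1: bottleneck weights, shape (C // ratio) x C (hypothetical layout).
    w2: expansion weights, shape C x (C // ratio).
    """
    # Squeeze: global average pooling yields one descriptor per channel.
    z = [sum(ch) / len(ch) for ch in feature_maps]

    # Excitation: fully connected layer + ReLU, then fully connected + sigmoid.
    hidden = [max(0.0, sum(w * v for w, v in zip(row, z))) for row in w1]
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w2
    ]

    # Recalibration: scale every channel by its gate in (0, 1).
    scaled = [[g * x for x in ch] for g, ch in zip(gates, feature_maps)]
    return scaled, gates
```

Each gate depends on the pooled descriptors of all channels through `w1` and `w2`, which is how the block models interdependencies between channels; the extra cost is two small matrix-vector products per block.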