Research On Speaker Recognition Based On DenseNet

Posted on:2022-09-19

Degree:Master

Type:Thesis

Country:China

Candidate:M M Sun

Full Text:PDF

GTID:2518306557469364

Subject:Electronics and Communications Engineering

Abstract/Summary:

With improved computer performance and the rapid development of deep learning, speech technology has become common in daily life. Speaker recognition, also known as voiceprint recognition, is one of the important branches in the field of speech recognition. It is a biometric technique that determines a speaker's identity from the voice. Compared with other biometric technologies, speaker recognition has attracted growing attention because of its convenience, and it is widely used in many fields. This thesis optimizes a common text-independent speaker recognition system and improves the model's final recognition performance. The main work is as follows:

1. Optimization of the data preprocessing stage. Building on conventional endpoint detection, a speaker recognition system based on DenseNet and between-class learning (BC-learning) is proposed. The approach learns a discriminative feature space by recognizing between-class speech. A between-class sample is generated by mixing two utterances that belong to different classes at a random ratio. The mixed audio is then fed to the model, which is trained to output the mixing ratio. The benefit of BC-learning is not limited to the larger amount of training data: it also enlarges the Fisher criterion in the feature space and regularizes the positional relationship between the feature distributions of different classes. Experimental results show that BC-learning effectively improves speaker recognition accuracy.

2. Optimization of the network model. As a traditional convolutional neural network grows deeper, feature information is inevitably lost as it passes through the layers. This thesis proposes a DenseNet-SEblock speaker recognition system, which alleviates the impact of this problem on recognition performance. A multi-scale design is introduced on top of the DenseNet model, turning the original network into a multi-scale model. In addition, the SEblock (squeeze-and-excitation) module is added to the model. This module adaptively recalibrates channel-wise feature responses by explicitly modeling the interdependence between channels. Experiments show that, at a slight increase in computational cost, the DenseNet-SEblock model improves the recognition rate.
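The between-class mixing step in item 1 can be illustrated with a minimal sketch. It assumes a plain linear mix of two equal-length waveforms and a soft target vector that holds the mixing ratio; the function name `bc_mix`, the speaker indices, and the random waveforms are hypothetical, and the thesis may use a different mixing formula (for example one that accounts for signal gain).

```python
import numpy as np

def bc_mix(wave_a, label_a, wave_b, label_b, num_speakers, rng):
    """Mix two equal-length utterances from different speakers at a random ratio.

    Returns the mixed waveform and a soft target whose entries are the
    mixing ratio for each of the two source speakers (assumed formulation).
    """
    if label_a == label_b:
        raise ValueError("between-class mixing needs two different speakers")
    r = rng.uniform(0.0, 1.0)                  # random mixing ratio
    mixed = r * wave_a + (1.0 - r) * wave_b    # simple linear mix (assumption)
    target = np.zeros(num_speakers)
    target[label_a] = r                        # the model is trained to predict r
    target[label_b] = 1.0 - r
    return mixed, target

# Hypothetical data: two one-second utterances at 16 kHz from speakers 3 and 7.
rng = np.random.default_rng(0)
wave_a = rng.standard_normal(16000)
wave_b = rng.standard_normal(16000)
mixed, target = bc_mix(wave_a, 3, wave_b, 7, num_speakers=10, rng=rng)
```

A network trained on such pairs would typically be fitted to the soft target with a divergence-based loss rather than a hard-label loss; that choice is not stated in the abstract.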
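The channel recalibration performed by the SEblock in item 2 can be sketched as follows. This is an illustrative NumPy forward pass of a generic squeeze-and-excitation block, not the thesis implementation; the weights are random placeholders, and the channel count and reduction ratio are assumptions.

```python
import numpy as np

def se_block(features, w1, b1, w2, b2):
    """Squeeze-and-excitation over a feature map of shape (channels, height, width)."""
    # Squeeze: global average pooling gives one descriptor per channel.
    squeezed = features.mean(axis=(1, 2))
    # Excitation: bottleneck layer with ReLU, then a sigmoid gate per channel.
    hidden = np.maximum(0.0, w1 @ squeezed + b1)
    gates = 1.0 / (1.0 + np.exp(-(w2 @ hidden + b2)))
    # Scale: reweight every channel of the input by its gate.
    return features * gates[:, None, None], gates

# Hypothetical sizes: 16 channels, reduction ratio 4, an 8x8 feature map.
rng = np.random.default_rng(0)
channels, reduction = 16, 4
features = rng.standard_normal((channels, 8, 8))
w1 = rng.standard_normal((channels // reduction, channels)) * 0.1
b1 = np.zeros(channels // reduction)
w2 = rng.standard_normal((channels, channels // reduction)) * 0.1
b2 = np.zeros(channels)
scaled, gates = se_block(features, w1, b1, w2, b2)
```

The block adds only two small fully connected layers per feature map, which is consistent with the slight increase in computational cost reported in the abstract.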

Keywords/Search Tags:

Speaker Recognition, Deep learning, DenseNet, Squeeze and Excitation Networks, Between class learning, Multi scale

Related items

1	Single Image Rain Removal Based On Squeeze-and-Excitation Networks
2	Spatiotemporal Squeeze-and-Excitation Residual Multiplier Networks For Video Action Recognition
3	Research And Implementation Of Multi Speaker Recognition Technology Based On Deep Learning
4	Research On Object Detection Method By Parallel Connecting Deep-shallow Layers With Squeeze-and-excitation
5	Person Re-identification Based On Multi-scale Feature And Siamese-GAN Network
6	Research On Key Technologies Of Speaker Recognition Based On Deep Learning
7	Research On Deep Learning Based Speaker Recognition Modeling
8	The Application Of Speaker Recognition Technology Based On Deep Learning
9	Text Independent Speaker Recognition Based On Deep Learning Framework
10	Research And Implementation Of Deep Belief Networks Based Speaker Recognition