Research On Short-speech Speaker Verification Method Based On Multi-branch Aggregation Network

Posted on:2022-04-11

Degree:Master

Type:Thesis

Country:China

Candidate:Y Q Yang

Full Text:PDF

GTID:2518306572960029

Subject:Computer technology

Abstract/Summary:

Speaker verification is the task of determining whether a given utterance was spoken by a claimed speaker. With the rapid development of the Internet and the widespread adoption of mobile devices, collecting a person's voice data has become easier, which has greatly facilitated research on speaker verification. Although the technology has made considerable progress over decades of development, verification under short-speech conditions remains difficult: short utterances carry little speaker identity information, so it is hard to extract sufficiently discriminative speaker features. This in turn degrades the model's scoring and decisions and the overall recognition performance of the system. Short-speech speaker verification therefore remains a challenging task. To address it, the research in this thesis covers the following aspects:

(1) A speaker embedding extraction method based on a Multi-Branch Aggregation (MBA) network is proposed. Because a single-channel system struggles to extract sufficient speaker identity information, two variants of the Time-Delay Neural Network (TDNN) are combined into a multi-branch structure so that more features can be extracted: the Large_TDNN (L_TDNN) network, which increases the number of nodes and the delay values, and the Small_TDNN (S_TDNN) network, which reduces them. The features of each branch are aggregated through a pooling layer and then combined by feature splicing (concatenation). Experimental results show that this method outperforms the baseline system on the test speech.

(2) A speaker embedding extraction method based on a Multi-Branch and Multi-Scale Aggregation (MBMSA) network is proposed. Within each individual branch, information is lost as features pass from the lower layers of the network to the upper layers. To recover this information and to retain as much lower-layer information as possible at each stage of feature transmission, a multi-scale aggregation method is adopted within the multi-branch network, further improving the performance of the algorithm. Because this method requires diverse scales across network layers, the multi-branch multi-scale aggregation network is built from residual networks (ResNet) based on Convolutional Neural Networks (CNN). Experimental results show that the proposed MBMSA network achieves better results on short-speech speaker verification.
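The multi-branch aggregation idea in (1) can be illustrated with a small sketch. This is not the thesis implementation: it assumes one time-delay layer per branch, mean-and-standard-deviation (statistics) pooling, random untrained weights, and made-up layer sizes and context offsets. The names tdnn_layer, stats_pool, make_weights, and mba_embedding are hypothetical helpers introduced only for this example.

```python
import math
import random


def tdnn_layer(frames, weights, context):
    """One time-delay layer: each output frame sees the input frames at the given offsets."""
    lo, hi = min(context), max(context)
    out = []
    for t in range(-lo, len(frames) - hi):
        # Splice the frames at the chosen time offsets into one long vector.
        spliced = [x for off in context for x in frames[t + off]]
        # Affine transform (no bias, for brevity) followed by ReLU.
        out.append([max(0.0, sum(w * x for w, x in zip(row, spliced)))
                    for row in weights])
    return out


def stats_pool(frames):
    """Collapse a variable-length frame sequence into per-dimension mean and std."""
    n = len(frames)
    dims = list(zip(*frames))
    means = [sum(d) / n for d in dims]
    stds = [math.sqrt(sum((x - m) ** 2 for x in d) / n)
            for d, m in zip(dims, means)]
    return means + stds


def make_weights(rng, n_out, n_in):
    """Random (untrained) weights, only so the sketch can run end to end."""
    scale = 1.0 / math.sqrt(n_in)
    return [[rng.uniform(-scale, scale) for _ in range(n_in)]
            for _ in range(n_out)]


def mba_embedding(frames, branches):
    """Run every branch, pool each one over time, then concatenate the results."""
    pooled = []
    for weights, context in branches:
        pooled.extend(stats_pool(tdnn_layer(frames, weights, context)))
    return pooled


rng = random.Random(0)
feat_dim = 8
# A fake 50-frame utterance of 8-dimensional acoustic features.
utterance = [[rng.gauss(0, 1) for _ in range(feat_dim)] for _ in range(50)]

# Assumed sizes: the "large" branch has more nodes and a wider delay range,
# the "small" branch has fewer nodes and a narrower delay range.
large_ctx = (-3, -2, -1, 0, 1, 2, 3)
small_ctx = (-1, 0, 1)
branches = [
    (make_weights(rng, 16, feat_dim * len(large_ctx)), large_ctx),
    (make_weights(rng, 4, feat_dim * len(small_ctx)), small_ctx),
]

embedding = mba_embedding(utterance, branches)
print(len(embedding))  # 40 = (16 + 4) nodes, each contributing a mean and a std
```

Pooling each branch separately before concatenation means the branches may produce different numbers of frames (here 44 and 48) without any alignment step, which is one plausible reason to aggregate after the pooling layer.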

Keywords/Search Tags:

Speaker Verification, Short Speech, TDNN, MBA, MBMSA

Related items

1	Speaker Verification Based On Limited Speech Data
2	Automatic speechreading for improved speech recognition and speaker verification
3	Speaker Extraction And Verification Based On Deep Learning
4	Analysis Of Speaker Roles For Multi-speaker Conversational Speech
5	Research On Speaker Verification Based On Telephone Conversation
6	Short Speech Speaker Recognition Method Based On Deep Learning And Its Application In Speech Separation
7	Research Of Robust Speaker Verification Based On Deep Learning
8	Research On Speaker Recognition Over Short Utterance And Varying Channels
9	The HHT Transform In Speaker Recognition
10	Discriminative and generative approaches for long- and short-term speaker characteristics modeling: Application to speaker verification