Research On Acceleration Method Of Speaker Identification

Posted on: 2022-05-07
Degree: Master
Type: Thesis
Country: China
Candidate: X Y Yuan
Full Text: PDF
GTID: 2518306485459324
Subject: Computer software and theory
Abstract/Summary:
Speaker identification scores the voiceprint model of each utterance to be recognized against every speaker model in the identification set. An identification list of all utterance-speaker pairs is generated first; a likelihood score is then computed for each pair in the list, and the scores are sorted to find the highest-scoring speaker or the top few candidates. Because every pair takes a certain amount of time to score, this exhaustive approach makes the overall computation time long. When the number of speakers is large, scoring each utterance against all speakers incurs heavy computational overhead and poor time efficiency.

This thesis studies acceleration methods for closed-set, text-independent speaker identification and proposes two of them. The first uses a two-layer reference speaker model. A clustering algorithm groups the target speakers in the model library into multiple clusters, and each cluster center serves as a reference speaker model that captures the features common to the speaker models in its group. During identification, likelihood scores are computed between the speech to be recognized and all reference speaker models, the group with the highest score is selected, and a search within that group finds the speaker model most similar to the speech. This reduces the total number of comparisons and thereby accelerates speaker identification.

The second method binary-encodes the speaker representation while preserving speaker-specific characteristics. Likelihood scores between speech features and speaker features, originally computed in the continuous real-valued space, are instead computed in the binary Hamming space. An approximate nearest neighbor search algorithm finds the matching speaker representation in the Hamming space, using the Hamming distance and the Earth Mover's Distance as measures. Because likelihood computation in the binary space is much faster than in the original space, the time for each comparison drops and overall identification efficiency improves considerably. Experimental results show that both proposed methods, clustering and binary encoding, effectively shorten identification time.
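The two-layer search described above can be sketched roughly as follows. This is an illustrative sketch only, not the thesis implementation: cosine similarity stands in for the likelihood score, cluster assignments are assumed to be precomputed by some clustering algorithm, and all names (`build_groups`, `identify_two_layer`, the toy speaker vectors) are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity, used here as a stand-in for the likelihood score."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def build_groups(models, assignment):
    """Group speakers by cluster id and compute one center per cluster.

    models: {speaker_id: vector}; assignment: {speaker_id: cluster_id}.
    Each center plays the role of a reference speaker model.
    """
    groups = {}
    for spk, cid in assignment.items():
        groups.setdefault(cid, []).append(spk)
    centers = {}
    for cid, members in groups.items():
        dim = len(models[members[0]])
        centers[cid] = [
            sum(models[s][d] for s in members) / len(members) for d in range(dim)
        ]
    return groups, centers

def identify_two_layer(query, models, groups, centers):
    """Return (best speaker, number of scores computed)."""
    # Layer 1: score the query against the reference speaker models only.
    best_cid = max(centers, key=lambda c: cosine(query, centers[c]))
    # Layer 2: score the query against the speakers of the selected group.
    best_spk = max(groups[best_cid], key=lambda s: cosine(query, models[s]))
    return best_spk, len(centers) + len(groups[best_cid])

# Toy data (assumed): six speakers in two well-separated clusters.
models = {
    "s1": [1.0, 0.1], "s2": [0.9, 0.2], "s3": [1.0, 0.3],
    "s4": [0.1, 1.0], "s5": [0.2, 0.9], "s6": [0.3, 1.0],
}
assignment = {"s1": 0, "s2": 0, "s3": 0, "s4": 1, "s5": 1, "s6": 1}
groups, centers = build_groups(models, assignment)
result = identify_two_layer([0.2, 0.9], models, groups, centers)
print(result)  # ('s5', 5): 5 scores instead of 6 for an exhaustive search
```

With K clusters of roughly N/K speakers each, the number of scores per utterance falls from N to about K + N/K. The trade-off is that a wrong choice in the first layer cannot be corrected in the second.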
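The binary-encoding idea can be illustrated with a small sketch. The abstract does not say how the codes are produced, so this example assumes sign random projection (a common locality-sensitive hashing scheme) as the binarization step, and it uses a plain linear scan with the Hamming distance rather than an approximate nearest neighbor index or the Earth Mover's Distance. All function names and data are hypothetical.

```python
import random

def make_hyperplanes(dim, n_bits, seed=0):
    """Draw n_bits random hyperplanes (assumed binarization scheme)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]

def binarize(vec, planes):
    """Encode a real-valued vector as an integer whose bits are projection signs."""
    code = 0
    for plane in planes:
        bit = 1 if sum(a * b for a, b in zip(vec, plane)) >= 0 else 0
        code = (code << 1) | bit
    return code

def hamming(a, b):
    """Hamming distance between two codes: XOR, then count the set bits."""
    return bin(a ^ b).count("1")

def identify_hamming(query_code, codes):
    """Return the speaker whose code is closest to the query in Hamming space."""
    return min(codes, key=lambda s: hamming(query_code, codes[s]))

# Toy data (assumed): three speaker representations in 4 dimensions.
planes = make_hyperplanes(dim=4, n_bits=64)
speakers = {
    "a": [0.9, 0.1, -0.3, 0.5],
    "b": [-0.9, -0.1, 0.3, -0.5],
    "c": [0.1, -0.8, 0.6, 0.2],
}
codes = {spk: binarize(vec, planes) for spk, vec in speakers.items()}
query_code = binarize([0.9, 0.1, -0.3, 0.5], planes)
print(identify_hamming(query_code, codes))
```

Each comparison reduces to one XOR and a bit count on a 64-bit code, which is the kind of per-comparison saving the abstract attributes to working in the Hamming space.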
Keywords/Search Tags:Speaker Identification, Likelihood Score, Reference Speaker Model, Clustering, Binary Coding