Speaker Recognition With Emotional Speech

Posted on: 2020-01-28
Degree: Master
Type: Thesis
Country: China
Candidate: Ahmad Faraz Hussain
GTID: 2428330590961609
Subject: Information and Communication Engineering
Abstract/Summary:
Speaker recognition is an integrative technology that uses the vocal features of speakers to infer their identity. It is the branch of biometrics used for the identification, verification, and categorization of particular speakers and, by extension, for speaker detection, tracking, and segmentation. Speaker recognition is the only biometric that can easily be checked (verified or identified) remotely over existing infrastructure, i.e. mobile and telephone networks. This makes speaker recognition very important, and with the increasing number and complexity of cellular (mobile) telephones it will become more popular in the future. It can potentially be applied in many settings, such as access control, transaction authorization over a mobile phone, and identification of a forensic suspect by his or her voice. Other biometrics require special acquisition hardware, whereas speaker recognition needs only a microphone.

Although speaker recognition research has been ongoing for more than four decades, recognition performance is still affected by the speaker's health, age, and emotional state, and by background noise. To build an emotional speaker recognition system, this thesis uses the Kaldi GMM i-vector toolkit to design a system that is tested in clean and noisy environments. The main work and contributions of this thesis are the following:

1. In the field of speaker recognition, the i-vector has been shown to be very efficient because it is a fixed-length, low-dimensional feature vector. The i-vector approach is used for emotional speaker recognition on a text-dependent database in clean and noisy environments. The database contains six emotions: sad, angry, fear, happy, neutral, and disgust. Kaldi provides cepstral mean and variance normalization (CMVN), which is used to normalize the MFCC features. Gaussian mixture models (GMMs) are used for training and testing the system.

2. For channel/session compensation, linear discriminant analysis (LDA), probabilistic linear discriminant analysis (PLDA), and within-class covariance normalization (WCCN) are proposed. The equal error rate (EER) is used for performance evaluation.
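
Two of the steps named above, CMVN on MFCC features and EER-based evaluation, can be illustrated with a minimal NumPy sketch. This is not the thesis code and not Kaldi's implementation; the function names `cmvn` and `equal_error_rate`, the per-utterance normalization, and the threshold sweep are assumptions made for illustration only.

```python
import numpy as np


def cmvn(features, eps=1e-10):
    """Per-utterance cepstral mean and variance normalization.

    `features` is assumed to be a (num_frames, num_coefficients) array
    of MFCCs. Each coefficient is shifted to zero mean and scaled to
    unit variance across the frames of the utterance.
    """
    features = np.asarray(features, dtype=float)
    mean = features.mean(axis=0)
    std = features.std(axis=0)
    return (features - mean) / (std + eps)  # eps guards against zero variance


def equal_error_rate(target_scores, nontarget_scores):
    """Approximate EER from verification scores (higher = more similar).

    A trial is accepted when its score is >= the threshold. The EER is
    taken at the threshold where the false acceptance rate and the
    false rejection rate are closest to each other.
    """
    target = np.asarray(target_scores, dtype=float)
    nontarget = np.asarray(nontarget_scores, dtype=float)
    thresholds = np.sort(np.concatenate([target, nontarget]))
    # False rejection: a genuine (target) trial scores below the threshold.
    frr = np.array([(target < t).mean() for t in thresholds])
    # False acceptance: an impostor (non-target) trial scores at or above it.
    far = np.array([(nontarget >= t).mean() for t in thresholds])
    idx = np.argmin(np.abs(far - frr))
    return (far[idx] + frr[idx]) / 2.0


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    mfcc = rng.normal(loc=5.0, scale=3.0, size=(200, 13))  # synthetic "MFCCs"
    normalized = cmvn(mfcc)
    print("mean after CMVN:", normalized.mean(axis=0).round(6))

    # Hypothetical scores, e.g. from PLDA or cosine scoring of i-vectors.
    eer = equal_error_rate([0.6, 0.8, 0.4, 0.9], [0.5, 0.1, 0.2, 0.3])
    print("EER:", eer)
```

The threshold sweep uses only the observed scores as candidate thresholds, so the result is an approximation on small trial lists; an evaluation on a real test set would typically interpolate between the two error curves.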
Keywords/Search Tags:Speaker recognition, MFCC, GMM, I-Vector, PLDA