Research On Voiceprint Recognition Based On Speech Feature Fusion

Posted on: 2021-05-29
Degree: Master
Type: Thesis
Country: China
Candidate: M X Dai
GTID: 2428330605981143
Subject: Computer Science and Technology
Abstract/Summary:
Speech is the most common information carrier in human communication, and with the development of intelligent technology it plays a vital role in human-computer interaction. The essence of voiceprint recognition is to identify a speaker from his or her speech. As an important branch of biometric authentication, voiceprint recognition is widely used in criminal investigation, human-computer interaction verification and attendance systems. A voiceprint recognition system consists mainly of speech feature extraction and a recognition model. Speech contains both personal information and common information. Differences in personal information are caused by differences in vocal organs and pronunciation habits, while common information depends on the text being spoken. Speech feature extraction aims to extract the personal information in speech. Commonly used speech features include the Mel-Frequency Cepstrum Coefficient (MFCC), the Linear Prediction Cepstrum Coefficient (LPCC), Perceptual Linear Prediction (PLP) and spectrograms. Among recognition models, the Gaussian mixture model (GMM) has excellent recognition performance and is widely used. In recent years, the Convolutional Neural Network (CNN) has been introduced into voiceprint recognition with good research progress. However, the accuracy of voiceprint recognition systems that use a single feature and a single recognition model still cannot meet the high accuracy requirements of some fields. Based on these research difficulties and emphases, this thesis comprises the following parts:

1. To address the insufficient accuracy of single features, this thesis used the Fisher criterion to select dimensions of commonly used multi-dimensional speech features. In addition, single-dimensional features such as pitch frequency and spectral centroid were added to obtain new fusion feature parameters. A GMM was used as the recognition model for comparative experiments. The accuracy of the PLP-LPCC-PF-SC feature reached 94.37%, an improvement of 6.92%, 12.79% and 13.58% over the traditional PLP, LPCC and MFCC features, respectively.

2. To address the insufficient accuracy of a single recognition model, this thesis proposed a model combination method. Based on an analysis of the voting-based decision of the GMM, threshold parameters were obtained from the training samples and used to design a segmentation function that combines two GMMs. The segmentation function evaluates the recognition result of the first GMM; test samples judged to be wrongly recognized are filtered out and passed to the second GMM to be recognized again. Provided that the two GMMs take different speech feature parameters as input and both have good recognition performance, they complement each other in recognition capability. The second GMM can correct the recognition result of the first GMM, thereby improving the overall recognition accuracy of the system. The accuracy of the joint GMM model reached 95.63%, which was 1.26 percentage points higher than that of the single GMM model.

3. To exploit the respective advantages of different speech feature parameters and different recognition models, and to let them complement each other in performance, this thesis used the above fusion feature parameters to train a GMM and used the spectrogram to train a ResNet, into which LSTM and an attention mechanism were introduced. The two recognition models were combined, and the final accuracy reached 95.87%, an improvement over both single models.

Experiments on the DARPA TIMIT Acoustic-Phonetic Continuous Speech Corpus (TIMIT) database verified the effectiveness of the methods proposed in this thesis.
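As an illustration of the dimension selection described in part 1, the following is a minimal sketch of a per-dimension Fisher ratio, assuming the common definition (variance of the per-speaker means divided by the average within-speaker variance). The abstract does not give the exact form of the criterion used in the thesis, so the function names `fisher_ratios` and `select_dimensions`, the `keep` parameter and the toy data are hypothetical and serve only to show the idea.

```python
from collections import defaultdict
from statistics import fmean, pvariance


def fisher_ratios(frames, labels):
    """Return one Fisher ratio per feature dimension.

    Assumed definition (not confirmed by the abstract):
    ratio = variance of per-speaker means / mean within-speaker variance.
    """
    by_speaker = defaultdict(list)
    for vec, spk in zip(frames, labels):
        by_speaker[spk].append(vec)

    ratios = []
    for d in range(len(frames[0])):
        # Values of dimension d, grouped by speaker.
        columns = [[v[d] for v in vecs] for vecs in by_speaker.values()]
        between = pvariance([fmean(c) for c in columns])
        within = fmean([pvariance(c) for c in columns])
        ratios.append(between / within if within > 0 else float("inf"))
    return ratios


def select_dimensions(frames, labels, keep):
    """Indices of the `keep` dimensions with the largest Fisher ratio."""
    ratios = fisher_ratios(frames, labels)
    ranked = sorted(range(len(ratios)), key=lambda d: ratios[d], reverse=True)
    return sorted(ranked[:keep])


if __name__ == "__main__":
    # Toy data: dimension 0 separates the two speakers, dimension 1 does not.
    frames = [[1.0, 5.0], [1.2, 3.0], [4.0, 4.0], [4.2, 4.5]]
    labels = ["A", "A", "B", "B"]
    print(fisher_ratios(frames, labels))
    print(select_dimensions(frames, labels, keep=1))
```

Dimensions with a high ratio vary more between speakers than within a speaker, so keeping them and discarding the rest is one plausible way to filter features such as PLP or LPCC before fusing them with pitch frequency and spectral centroid.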
Keywords/Search Tags: voiceprint recognition, feature fusion, joint model, Gaussian mixture model, convolutional neural network