Research On Voiceprint Recognition Based On Speech Feature Fusion

Posted on:2021-05-29

Degree:Master

Type:Thesis

Country:China

Candidate:M X Dai

GTID:2428330605981143

Subject:Computer Science and Technology

Abstract/Summary:

Speech is the most common information carrier in human communication, and with the development of intelligent technology it plays a vital role in human-computer interaction. The essence of voiceprint recognition is to identify a speaker from his or her speech. As an important branch of biometric authentication, voiceprint recognition is widely used in criminal investigation, human-computer interaction verification and attendance systems. A voiceprint recognition system consists mainly of speech feature extraction and a recognition model. Speech contains both personal information and common information: personal information differs between speakers because of differences in vocal organs and pronunciation habits, whereas common information depends on the text being spoken. Speech feature extraction aims to extract the personal information in speech. Commonly used speech features include Mel-Frequency Cepstral Coefficients (MFCC), Linear Prediction Cepstral Coefficients (LPCC), Perceptual Linear Prediction (PLP) and spectrograms. As a recognition model, the Gaussian mixture model (GMM) has excellent recognition performance and is widely used. In recent years, the Convolutional Neural Network (CNN) has been introduced into voiceprint recognition with good results. However, the accuracy of voiceprint recognition systems that use a single feature and a single recognition model still cannot meet the high accuracy requirements of some fields. To address these difficulties, this thesis makes the following contributions:

1. To solve the problem of insufficient recognition accuracy with single features, this thesis used the Fisher criterion to filter the dimensions of commonly used multi-dimensional speech features. In addition, one-dimensional features such as pitch frequency and spectral centroid were added to obtain new fusion feature parameters. A GMM was used as the recognition model for comparative experiments. The accuracy of the PLP-LPCC-PF-SC feature reached 94.37%, an improvement of 6.92%, 12.79% and 13.58% over the traditional PLP, LPCC and MFCC features, respectively.

2. To solve the problem of insufficient accuracy with a single recognition model, this thesis proposed a model combination method. By analyzing the voting-based decisions of the GMM, threshold parameters were obtained from the training samples and used to design a segmentation function that combines two GMMs. The segmentation function judges the recognition result of the first GMM, filters out the test samples judged to be wrongly recognized, and passes them to the second GMM to be recognized again. Provided that the two GMMs take different speech feature parameters as input and both have good recognition performance, they can complement each other in recognition capability: the second GMM can correct the recognition result of the first, improving the overall recognition accuracy of the system. The accuracy of the joint GMM model reached 95.63%, which was 1.26% higher than that of the single GMM model.

3. To fully exploit the respective advantages of different speech feature parameters and different recognition models and let them complement each other, this thesis used the fusion feature parameters above to train a GMM and used spectrograms to train a ResNet into which LSTM and an attention mechanism were introduced. The two recognition models were combined, and the final accuracy reached 95.87%, an improvement over both single models.

Experiments on the DARPA TIMIT Acoustic-Phonetic Continuous Speech Corpus (TIMIT) verified the effectiveness of the methods proposed in this thesis.
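
The abstract does not give the exact form of the Fisher criterion used to filter feature dimensions. As a rough illustration only, the sketch below assumes a common per-dimension definition: the variance of the per-speaker means (between-speaker scatter) divided by the average per-speaker variance (within-speaker scatter). The function names `fisher_ratios` and `select_dimensions`, and the toy data, are hypothetical and not taken from the thesis.

```python
from statistics import fmean, pvariance


def fisher_ratios(features_by_speaker):
    """Return one Fisher ratio per feature dimension.

    features_by_speaker maps a speaker label to a list of feature
    vectors (lists of floats, all of the same length).
    Assumed definition: between-speaker variance / within-speaker variance.
    """
    speakers = list(features_by_speaker)
    n_dims = len(features_by_speaker[speakers[0]][0])
    ratios = []
    for d in range(n_dims):
        # Values of dimension d, grouped by speaker.
        columns = [[vec[d] for vec in features_by_speaker[s]] for s in speakers]
        # Between-speaker scatter: spread of the per-speaker means.
        between = pvariance([fmean(col) for col in columns])
        # Within-speaker scatter: average variance inside each speaker.
        within = fmean([pvariance(col) for col in columns])
        # A dimension with no within-speaker spread is ranked first.
        ratios.append(between / within if within > 0 else float("inf"))
    return ratios


def select_dimensions(features_by_speaker, keep):
    """Indices of the `keep` dimensions with the largest Fisher ratio."""
    ratios = fisher_ratios(features_by_speaker)
    ranked = sorted(range(len(ratios)), key=lambda d: ratios[d], reverse=True)
    return sorted(ranked[:keep])


# Toy data: dimension 0 separates the two speakers, dimension 1 does not.
toy = {
    "spk_a": [[1.0, 5.0], [1.2, 1.0], [0.8, 3.0]],
    "spk_b": [[3.0, 2.0], [3.2, 4.0], [2.8, 3.0]],
}
print(select_dimensions(toy, keep=1))
```

On the toy data, dimension 0 is kept because the two speakers' means differ while each speaker's values stay tightly clustered; dimension 1 has identical speaker means and therefore a ratio of zero. In a real system the inputs would be frame-level PLP, LPCC or MFCC vectors rather than hand-written numbers.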

Keywords/Search Tags:

voiceprint recognition, feature fusion, joint model, Gaussian mixture model, convolutional neural network

Related items

1	Research On The Extraction Method Of Speech Feature Parameters In Voiceprint Recognition
2	The Method Of Voiceprint Confirmation With Prevent Audio Replay
3	Research On Voiceprint Recognition System Based On GMM
4	Research Of Voiceprint Recognition System
5	Research Of Closed-set Voiceprint Recognition System Of Text-independent
6	Research On Voiceprint Recognition System Based On Gaussian Mixture Model
7	Research On The Method Of Confirmation Of Speaker Identity Based On Voiceprint Recognition
8	Research On The Voiceprint Recognition System With Background Noise
9	Research On Speaker Recognition Based On Fusion Feature And Gaussian Mixture Model
10	Research Of Speaker Recognition Technology Based On Fusion Features