Speaker Verification Based On The Glottis Information

Posted on:2015-01-22

Degree:Master

Type:Thesis

Country:China

Candidate:Q F Luo

Full Text:PDF

GTID:2268330428464409

Subject:Signal and Information Processing

Abstract/Summary:

Speech is the most important and natural means of communication between people. Extracting a speaker's identity from the speech signal is called speaker recognition, also known as voiceprint recognition, and it is one of the key technologies in speech signal processing. With the rapid development of intelligent computing and growing network security requirements, voiceprint recognition is attracting more and more attention and is moving toward practical application.

After years of research, speaker recognition systems have matured in laboratory environments. However, problems remain when speaker recognition is used in the real world, mainly concerning the computational efficiency and robustness of the system.

A speaker recognition system can be roughly divided into feature extraction and pattern recognition. The classic text-independent speaker recognition system uses Mel cepstral coefficients as the feature parameters and the UBM-MAP-GMM model for processing. Although the UBM-MAP-GMM model takes the mismatch between training speech and test speech into account, in practice its computation and storage requirements are still large and its robustness still needs improvement. This thesis studies how to extract and fuse different types of information in the speech signal and introduces the physical meaning of the system, in order to reduce the amount of computation and enhance the robustness of the voiceprint recognition system.

The main contents of the dissertation are as follows:

1. Introduce the physical meaning of the Gaussian mixture model and describe some improvements to UBM-MAP-GMM; analyze the classical model when phoneme classes are missing from the training speech; and then propose a speaker verification system model based on the selection of the Gaussian component. Experiments show that the improved speaker verification system achieves some improvement in both training time and equal error rate.

2. The short-term MFCC features reflect the vocal tract characteristics of the speaker, while prosodic features based on pitch and frame energy reflect the speaker's glottal information. They describe the speaker from different angles, so they can be fused to improve system performance. This thesis presents a feature fusion method based on secondary judgment; experiments show that this method yields some improvement in system performance.

3. Both energy-based glottal information and MFCC reflect the characteristics of a person, but because they describe the speaker at different levels, they interfere with each other. In this thesis, the disturbance caused by the glottal pulse airflow is reduced by removing the glottal pulse airflow information from MFCC, which improves the performance of the system.
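
The abstract does not spell out the decision rule behind the feature fusion method based on secondary judgment. The following Python sketch shows one possible reading, under stated assumptions: the MFCC-based score makes the primary accept/reject decision, and the prosodic score is consulted only when the MFCC score falls in an uncertain band around the threshold. The function name `verify`, the `margin` parameter, and all threshold values are hypothetical and are not taken from the thesis.

```python
def verify(mfcc_score, prosodic_score,
           threshold=0.0, margin=0.5, prosodic_threshold=0.0):
    """Two-stage accept/reject decision.

    Hypothetical rule for illustration only; the thesis may define
    secondary judgment differently.
    """
    # Primary judgment: accept or reject outright when the MFCC score
    # is clearly above or below the decision threshold.
    if mfcc_score >= threshold + margin:
        return True
    if mfcc_score <= threshold - margin:
        return False
    # Secondary judgment: inside the uncertain band, defer to the
    # prosodic (pitch and frame energy) score.
    return prosodic_score >= prosodic_threshold


# A confident MFCC score decides alone; a borderline one defers
# to the prosodic score.
print(verify(1.0, -5.0))
print(verify(0.2, -0.3))
```

A design of this kind keeps the prosodic model out of the decision for clear-cut trials, so it only adds cost and influence where the primary score is ambiguous.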

Keywords/Search Tags:

selection of the Gaussian component, prosodic features, secondary judgment, glottis information

Related items

1	The Research Of High-level Information Fusion Based Speaker Recognition Algorithm Using Short Utterance
2	Chinese Prosodic Phrases Recognition Based On Syntax And Dependency
3	Chinese Prosodic Phrases Based On Text And Phonetic Features Boundary Prediction
4	Age Speech Conversion Based On Spectrum And Prosodic Features
5	Partitioning of prosodic features for audio similarity comparison
6	SVM Speaker Verification Based On Prosodic Feature
7	Automatic Personality Estimate,Recognition And Application Research Based On Chinese Prosodic Features
8	Research On Cigarettes Brands Classification And Authentication Based On Image Features
9	Pronunciation Evaluation Using Short And Long-term Features
10	Research On Predicting Chinese Prosodic Boundary Based On Syntactic Features