
Study of the Extraction and Optimization of Characteristic Parameters in Speaker Recognition

Posted on: 2011-06-03
Degree: Master
Type: Thesis
Country: China
Candidate: J W Zhu
Full Text: PDF
GTID: 2178330338483080
Subject: Control theory and control engineering
Abstract/Summary:
Speaker recognition has become a hot research topic in the field of speech signal processing. The two key components of a speaker recognition system are the features and the recognition algorithm. This thesis focuses on the former, i.e., feature extraction and optimization. The main contributions are as follows:

a. Voice activity detection (VAD) in strong-noise environments is improved by an algorithm based on subband reprocessed spectrum entropy (BRSE). A new voice/noise discrimination algorithm is proposed that combines a finite state machine (FSM) with BRSE. The misdetections caused by using a single threshold are greatly reduced. Experimental results show that the proposed algorithm is more accurate and more robust than the two other methods compared.

b. Three disadvantages of the power spectrum reprocessing (PSR) method in pitch detection are observed: half-pitch and double-pitch errors on transition sounds; low robustness for noisy speech; and an overly complex method of distinguishing voiceless from voiced speech. A series of improvements is proposed: nonlinear processing in the time domain; windowed filtering in the frequency domain; and a simplified voiceless/voiced decision. Experimental results show that the improved method detects the pitch trajectory more clearly and accurately than the average magnitude difference function (AMDF) method and the original PSR method.

c. Four features are selected: the Mel Frequency Cepstral Coefficient (MFCC) parameters, which are based on the characteristics of human hearing; the pitch contour, which is based on the physiological characteristics of pronunciation; the first-order difference of the pitch; and the pitch change rate. Experimental results show that the recognition rate of the proposed system is 2% to 3% higher than that of a speaker recognition system using the MFCC parameters only.

d. To improve speaker recognition accuracy, a new criterion for extracting MFCC parameters is proposed, based on the F-Ratio and a correlated distance criterion. Two methods of extracting MFCC parameters under this criterion are given: reducing the dimensionality, and improving the discrimination of the MFCCs with a new bandpass liftering window. Experimental results on a two-language voice database show that the recognition rate improves by 10-15% with the dimensionality reduction method and by 10-20% with the new liftering process.

In summary, this study focuses on feature extraction and optimization in speaker recognition. At the front end of the speaker recognition system, an accurate VAD method is selected. Four joint features are selected, and MFCC optimization methods are given. This work helps to improve the speaker recognition rate and contributes to the further development of feature extraction and optimization methods in speaker recognition.
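As an illustration of the F-Ratio idea mentioned in contribution d, the following is a minimal sketch, not the thesis's actual criterion. The abstract gives no formula, so the code assumes the commonly used definition: for each feature dimension, the variance of the per-speaker means divided by the average within-speaker variance. The correlated distance part of the criterion is not covered. The function names `f_ratio` and `top_dimensions`, and the toy data, are hypothetical.

```python
from statistics import mean, pvariance


def f_ratio(features_by_speaker):
    """Return one F-ratio per feature dimension.

    features_by_speaker maps a speaker label to a list of feature vectors
    (lists of floats, all of the same length). Assumed definition:
    between-speaker variance of the means / mean within-speaker variance.
    """
    first_speaker_vectors = next(iter(features_by_speaker.values()))
    dims = len(first_speaker_vectors[0])
    ratios = []
    for d in range(dims):
        # Collect the values of dimension d, grouped by speaker.
        per_speaker = [[vec[d] for vec in vecs]
                       for vecs in features_by_speaker.values()]
        speaker_means = [mean(vals) for vals in per_speaker]
        between = pvariance(speaker_means)
        within = mean([pvariance(vals) for vals in per_speaker])
        ratios.append(between / within if within > 0 else float("inf"))
    return ratios


def top_dimensions(ratios, k):
    """Indices of the k dimensions with the largest F-ratio."""
    return sorted(range(len(ratios)), key=lambda d: ratios[d], reverse=True)[:k]


# Toy data (hypothetical): dimension 0 separates the two speakers,
# dimension 1 has identical speaker means and so carries no speaker information.
speakers = {
    "spk_a": [[1.0, 5.0], [1.2, 3.0], [0.8, 7.0]],
    "spk_b": [[3.0, 5.5], [3.2, 2.5], [2.8, 7.0]],
}
ratios = f_ratio(speakers)
selected = top_dimensions(ratios, 1)
```

Under this assumed definition, a dimension with a high ratio varies more between speakers than within a speaker, so keeping only the top-ranked dimensions is one simple way to reduce MFCC dimensionality.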
Keywords/Search Tags:speaker recognition, voice activity detection, pitch detection, joint features, Fisher-Ratio and correlated distance criterion