
Speaker Recognition Technology Based On Auditory Feature Parameters

Posted on: 2017-08-27
Degree: Master
Type: Thesis
Country: China
Candidate: B F Xiong
Full Text: PDF
GTID: 2348330485965205
Subject: Electronic Science and Technology
Abstract/Summary:
Language is the most commonly used means of communicating information. It is natural, convenient, accurate, and efficient. As technology develops, studying speech signal processing with modern methods is very important for advancing human-computer interaction. Speech signal processing is an interdisciplinary science that combines phonetics with digital signal processing, and it is closely connected with many frontier subjects in information science. It can be divided into three directions: speech recognition, speech emotion recognition, and speaker recognition. Speech recognition lets a machine recognize and understand a human speech signal and convert it into the corresponding text or command. Speech emotion recognition lets a machine recognize emotion from the speech signal. Speaker recognition automatically identifies a speaker's identity by analyzing the speech signal.

The main research direction of this thesis is speaker recognition, also known as voiceprint recognition, which is an important branch of speech signal processing. One of the most critical steps in speaker recognition is the extraction of feature parameters, which has long been a hot research topic. Based on the current state of research in speaker recognition, this thesis carries out the following work:

(1) In the front end, the speech signal must be framed and windowed, but windowing tends to cause spectral leakage, which makes it harder to obtain accurate feature parameters. A Hamming self-convolution window, whose performance indicators are better than those of traditional window functions, is proposed and applied to speech signal preprocessing in place of the traditional window function. It can suppress the spectral distortion.

(2) Current mainstream speaker recognition systems use linear prediction cepstrum coefficients and Mel frequency cepstrum coefficients. Although such systems achieve a high recognition rate in high-SNR environments, their recognition rate drops rapidly in low-SNR environments. To solve this problem, this thesis presents an all-pole gammatone filter based on the auditory characteristics of the human ear. This filter is more consistent with the asymmetric filtering property of the human ear, and experiments verify that the auditory feature parameters are more robust.

(3) Current feature parameters are extracted with the Fourier transform to obtain the frequency-domain information of speech. However, the Fourier transform has a single resolution, which gives it no advantage for long, non-stationary speech signals. Because the wavelet transform offers multi-resolution analysis, this thesis applies cochlear filter cepstrum coefficients, obtained through the auditory transform, to the speaker recognition system. Experiments show that the cochlear filter cepstrum coefficient is more robust and more resistant to noise than classical feature parameters. Further simulation experiments on the Mel, Bark, and ERB scales show that the ERB-scale cochlear filter cepstrum coefficient is superior to the Bark-scale one.
Keywords/Search Tags:Hamming Self Convolution Windows, All Pole Gammatone Filter, Wavelet Transform, Cochlear Filter Cepstrum Coefficient