Text-independent Speaker Recognition Research Based On Local Acoustic Features

Posted on: 2017-03-24
Degree: Master
Type: Thesis
Country: China
Candidate: J M Guo
GTID: 2358330512467944
Subject: Signal and Information Processing
Abstract/Summary:
As computer technology has developed, a wide variety of new technologies and products have appeared. The most useful products are those that narrow the gap between human and machine, in other words, products that let people and machines communicate easily. Humans and such products can also learn from each other: the machine can optimize itself, while people can gain knowledge from the product. To achieve this, the computer must be able to recognize the speaker from his or her speech. This task is not easy, because we are never in a completely quiet environment. Many kinds of noise are always present, so recognition rates for different speakers decline. To solve this problem, we need to extract features from the speech that discriminate effectively between speakers, and to design a good model for recognizing different people. This thesis proposes a new method for extracting discriminative speech features while keeping the identification method the same. The feature extraction method is described in detail, while the identification method is only briefly mentioned. Speaker recognition is generally divided into text-dependent and text-independent recognition. Speaker recognition systems differ mainly in the characteristics they extract (feature vectors related to the speaker's individual traits), and their performance depends on how well those features distinguish speakers. Besides the formants of each speaker's speech, LPCC and MFCC are the most commonly adopted acoustic features. Researchers have also proposed the generalized synchronization detector, local normalized cepstral coefficients, the discrete wavelet transform and the wavelet packet transform. These methods are generally used when multiple speech samples are available, and they give poor recognition results with a single training sample.
The method proposed here, by contrast, can identify a speaker from a single voice sample. It extracts the main local frequency components of the speech near critical points. To demonstrate the advantage of the proposed method, we designed several comparative experiments. To aid understanding, the key algorithms are introduced in the second chapter. Based on the local features of speech in its time and frequency distribution, we propose a text-independent feature extraction method for a single training sample. This method extracts local features of the speech spectrum; these features are robust to white noise, Gaussian noise and pink noise, and they also reflect the speaker's basic voice characteristics. Based on the basic characteristics of these local features, a Bayesian decision method is applied to them. Simulation results on English and Chinese speech databases show that the method can be used for speaker recognition: with a single training sample, its recognition accuracy is significantly higher than that of MFCC and LPCC features, and it is also robust to noise. This robustness is also demonstrated mathematically. All experiments use the same speech databases (the standard TIMIT database and a Chinese database recorded by the author) and the same noise conditions. With five feature extraction methods there are 25 experiments: each method is first evaluated in two experiments on the clean speech databases, and then in three further experiments in which each of the three noise types is added separately, after which the identification results are compared. The experimental results show that the recognition accuracy of the proposed algorithm is higher than that of the conventional MFCC and LPCC algorithms, and also higher than that of the improved local normalized cepstral coefficients and the wavelet packet transform method.
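The abstract names a Bayesian decision method but gives no details, so the following is only a minimal sketch of what such a decision step could look like, not the thesis's implementation. It assumes that each speaker's single training utterance yields many local feature frames, that each speaker is modelled by a diagonal Gaussian, and that priors are equal. The function names (`fit_gaussian`, `bayes_decide`, `synthetic_frames`), the speaker labels and the synthetic data are all hypothetical.

```python
import math
import random


def fit_gaussian(frames, var_floor=1e-3):
    """Estimate per-dimension mean and variance from one utterance's frames.

    Assumption: a diagonal Gaussian stands in for the speaker model.
    """
    n = len(frames)
    dim = len(frames[0])
    mean = [sum(f[d] for f in frames) / n for d in range(dim)]
    var = [
        max(sum((f[d] - mean[d]) ** 2 for f in frames) / n, var_floor)
        for d in range(dim)
    ]
    return mean, var


def log_likelihood(frames, model):
    """Sum of diagonal-Gaussian log densities over all frames."""
    mean, var = model
    total = 0.0
    for f in frames:
        for x, m, v in zip(f, mean, var):
            total += -0.5 * (math.log(2 * math.pi * v) + (x - m) ** 2 / v)
    return total


def bayes_decide(frames, models, priors=None):
    """Return the speaker with the highest log posterior (equal priors by default)."""
    speakers = list(models)
    if priors is None:
        priors = {s: 1.0 / len(speakers) for s in speakers}
    scores = {
        s: math.log(priors[s]) + log_likelihood(frames, models[s])
        for s in speakers
    }
    return max(scores, key=scores.get)


def synthetic_frames(center, n, rng, spread=0.5):
    """Hypothetical stand-in for local spectrogram features of one utterance."""
    return [[rng.gauss(c, spread) for c in center] for _ in range(n)]


rng = random.Random(0)
centers = {"spk_a": [0.0, 1.0, -1.0], "spk_b": [2.0, -1.0, 0.5]}

# One training utterance per speaker (the single-training-sample setting).
models = {s: fit_gaussian(synthetic_frames(c, 200, rng)) for s, c in centers.items()}

# Classify an unseen utterance generated from spk_b's distribution.
test_frames = synthetic_frames(centers["spk_b"], 50, rng)
predicted = bayes_decide(test_frames, models)
print(predicted)
```

In a real system the synthetic frames would be replaced by the local features extracted from the speech spectrum, and the Gaussian assumption would need to be checked against the actual feature distribution.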
Keywords/Search Tags: Speaker recognition, Single training sample, Feature extraction, Local spectrogram features, Bayesian decision