Theory Research And Application Of Speaker Recognition Based On Fuzzy Clustering And Genetic Algorithms

Posted on: 2008-08-31
Degree: Doctor
Type: Dissertation
Country: China
Candidate: L Lin
GTID: 1118360212497943
Subject: Communication and Information System

Abstract/Summary:
Speaker recognition is a research field that uses information extracted from the speech signal to recognize different speakers automatically. With the development of computing, signal processing, and speech technology, speaker recognition, as a biometric identification technology, plays an important role in many areas. As speaker recognition moves into practical use, the ability to work with little training data has become a key factor in applications. For systems with little training data, especially less than eight seconds, establishing an effective speaker model that achieves good performance is a goal of this dissertation. Fuzzy clustering analysis uses the uncertainty with which each data point belongs to each category to describe the intermediate nature of data membership, and it has become an important and efficient tool in pattern recognition. The first part of this dissertation therefore studies several speaker recognition algorithms based on fuzzy clustering analysis. The genetic algorithm is an adaptive, global, probabilistic search algorithm. It simulates biological heredity and evolution in the natural environment and provides a general framework for solving optimization problems in complex systems. It does not depend on the specific problem domain and is robust across problem categories, so it has been used in many fields and has become one of the powerful approaches to global optimization. The Gaussian mixture model is one of the most popular speaker models in speaker recognition. It can smoothly approximate a distribution of any shape and, when enough training data is available, performs better than vector quantization. However, it is sensitive to the initial model parameters and in practice easily converges to a sub-optimal model.
The second part of the dissertation therefore uses the genetic algorithm to optimize the Gaussian mixture model and so improve the performance of speaker recognition systems based on it.

The original contributions of this dissertation cover five aspects:

1. To address the sensitivity of fuzzy clustering to its initial values, a genetic-fuzzy clustering speaker recognition algorithm is proposed. It effectively solves the problems caused by unsuitable initial values and makes speaker recognition with little (less than 8 s) training data possible.

2. Fuzzy clustering analysis is only suitable for describing hyperspherical data and cannot describe complex speech features precisely. A speaker recognition method based on fuzzy kernel vector quantization is therefore proposed to describe the feature distribution, which provides a novel method for speaker recognition with little training data. In addition, to remove the influence of the fuzzy weighting exponent m, a fuzzy kernel vector quantization speaker recognition algorithm based on entropy regularization is proposed. Entropy regularization improves overall system performance, and the improvement is most significant when the codebook is small.

3. To achieve high performance with little (less than 8 s) training data, a fuzzy kernel algorithm with a discriminative method is proposed for speaker recognition. It makes effective use of the fact that different parts of the speech signal discriminate between speakers to different degrees. The proposed method provides a theoretical basis for the practical use of speaker recognition.

4. To reduce the sensitivity of the Gaussian mixture model (GMM) to its initial values, a genetic-fuzzy GMM optimization method is proposed and applied in a speaker recognition system.
It combines the global convergence of genetic algorithms with the strong local search ability of fuzzy clustering to raise the rate at which the optimum is reached, which gives GMM-based speaker recognition better performance.

5. To address the premature convergence and weak exploitation capability of genetic algorithms, a speaker recognition method based on adaptive niche algorithms is proposed. The niche technique and maximum likelihood estimation are combined into a new hybrid structure that mitigates these two problems. In addition, discriminative information from the other speakers is integrated into the fitness function to increase classification accuracy, which provides an effective method for optimizing the Gaussian mixture model.

This dissertation is divided into eight chapters.

Chapter one summarizes the current state of research and the significance of speaker recognition, reviews the development of fuzzy clustering analysis and genetic algorithms, presents the main problems in speaker recognition based on fuzzy clustering and genetic algorithms, and finally defines the scope of this study.

Chapter two introduces background knowledge used in later chapters, such as the basic principles of speaker recognition, fuzzy theory and fuzzy clustering analysis, and genetic algorithm theory.

Chapter three studies a speaker recognition method based on genetic-fuzzy clustering analysis. Fuzzy clustering analysis is a new method in speaker recognition, but it is sensitive to the initial model parameters and in practice easily converges to a sub-optimal model. The trained model then describes the feature distribution imprecisely and system performance drops. To solve this problem, a genetic-fuzzy clustering analysis speaker recognition algorithm is proposed. The genetic algorithm is used in model training, and the rule of maximum overall average membership is used to identify the unknown speech.
An adaptive parameter strategy is also used to improve the exploitation capability of the GA. Experimental results show that this algorithm not only clusters complex Gaussian data reasonably but also improves system performance. In particular, with little training data the system performs better than a Gaussian mixture model.

Chapter four studies speaker recognition with little training data based on the kernel method. First, fuzzy kernel clustering is used to design the vector quantizer, and the resulting fuzzy kernel vector quantization is used to train the speakers' models. A non-linear mapping takes the data from the original space to a high-dimensional feature space. Fuzzy clustering is then applied to the speakers' training features in the feature space, and the cluster centers form each speaker's model. Recognition is also performed in the high-dimensional feature space. The kernel mapping exposes the inherent features of the speech, which improves discrimination between speakers, and a reasonable kernel width improves system performance further. To remove the influence of the weighting exponent m, the concept of entropy is used to define a fuzzy entropy objective function in the feature space, and a fuzzy kernel vector quantization speaker recognition algorithm based on entropy regularization is proposed. In addition, a method for updating the fuzzy entropy degree based on simulated annealing is proposed, and the influence of the initial value and the lower limit of the fuzzy entropy degree is discussed. Entropy regularization improves overall system performance, and the improvement is most significant when the codebook is small.

Chapter five studies a fuzzy kernel algorithm with a discriminative method for speaker recognition. Different parts of the speech signal discriminate between speakers to different degrees.
This property is used to define a fuzzy weighted objective function in the feature space, and a novel weight assignment method is proposed that assigns larger weights to code vectors with higher discriminative power. The codebooks and weights together form the speaker database. For the matching phase, a fuzzy kernel weighted nearest prototype classifier is proposed, which identifies speakers in the high-dimensional space. The experimental results show that the algorithm in this chapter achieves better identification results than speaker recognition based on fuzzy vector quantization, with or without entropy regularization. With a codebook size of 64, seven seconds of training data, and one second of identification data, the system error rate can be 2.7 percent.

Chapter six studies a Gaussian mixture model training method based on genetic algorithms and a fuzzy approach. The genetic algorithm and the fuzzy Gaussian mixture model estimation method are applied to optimize the model parameters globally and improve their precision. The fuzzy minimum objective function algorithm is used as a hybrid operation to re-estimate the model parameters. Adaptive crossover and mutation rates are also used to improve the GA's search ability and convergence speed. Using this algorithm in the speaker training stage improves system performance.

Chapter seven studies speaker recognition based on adaptive niche hybrid genetic algorithms. Niche techniques and the maximum likelihood (ML) algorithm are used in the genetic algorithm (GA) training step, giving a new hybrid architecture. The new hybrid algorithm reduces the likelihood of premature convergence and improves the exploitation capability of the GA. An adaptive updating strategy is also used to control the GA crossover and mutation rates.
In addition, discriminative information from the other speakers is integrated into the fitness function to increase classification accuracy and give the Gaussian mixture model (GMM) better generalization ability. In the speaker recognition experiments, the method obtains better GMM parameters and better results than the traditional method and two improved variants.

Chapter eight gives a brief summary of the dissertation and presents some open problems and directions for further research.
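
The abstract builds on fuzzy clustering and notes its sensitivity to initial values. As a point of reference only, here is a minimal sketch of the standard fuzzy c-means updates in plain Python. It is not the dissertation's algorithm: the function name, the parameters, and the toy data are illustrative assumptions, and the optional init_centers argument is a hypothetical hook marking where initial centers from another search procedure (such as a genetic algorithm) could be supplied.

```python
import math
import random


def fuzzy_c_means(points, n_clusters, m=2.0, n_iter=50, init_centers=None, seed=0):
    """Standard fuzzy c-means on a list of equal-length numeric tuples.

    Returns (centers, memberships). memberships[i][k] is the degree to which
    points[i] belongs to cluster k, computed against the centers of the final
    iteration before their last update; each row sums to 1.
    init_centers is a hypothetical hook for externally chosen starting centers.
    """
    if init_centers is None:
        # Random start: the result can depend on this choice, which is the
        # initial-value sensitivity the abstract refers to.
        rng = random.Random(seed)
        init_centers = rng.sample(points, n_clusters)
    centers = [list(c) for c in init_centers]
    dim = len(points[0])
    memberships = []
    for _ in range(n_iter):
        # Membership update: u_ik is proportional to d_ik ** (-2 / (m - 1)).
        memberships = []
        for p in points:
            dists = [max(math.dist(p, c), 1e-12) for c in centers]
            inv = [d ** (-2.0 / (m - 1.0)) for d in dists]
            total = sum(inv)
            memberships.append([v / total for v in inv])
        # Center update: mean of all points weighted by u_ik ** m.
        for k in range(n_clusters):
            weights = [u[k] ** m for u in memberships]
            wsum = sum(weights)
            centers[k] = [
                sum(w * p[j] for w, p in zip(weights, points)) / wsum
                for j in range(dim)
            ]
    return centers, memberships


# Toy 2-D data with two well-separated groups (illustrative only).
data = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2), (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
centers, memberships = fuzzy_c_means(
    data, n_clusters=2, init_centers=[(0.0, 0.1), (5.0, 5.1)]
)
for point, row in zip(data, memberships):
    print(point, [round(u, 3) for u in row])
```

The exponent m here is the fuzzy weighting exponent mentioned in the abstract: values close to 1 give nearly hard assignments, and larger values give softer memberships.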
Keywords/Search Tags: speaker recognition, little training data, fuzzy clustering, kernel method, genetic algorithms, niche technique, Gaussian mixture model