
Speaker Recognition Based On Affinity Propagation Clustering And Ensemble Learning

Posted on: 2020-08-19
Degree: Master
Type: Thesis
Country: China
Candidate: C Zhang
Full Text: PDF
GTID: 2428330575490144
Subject: Control Science and Engineering
Abstract/Summary:
Speaker recognition is a more natural form of biometric identification than most alternatives, but in terms of technical maturity it is still a developing technology within biometrics. It currently faces two technical difficulties. On the feature-extraction side, because the human vocal system is variable, a speaker's voiceprint features must be extracted from a large number of voice samples, and the many redundant samples this produces make the classification model difficult to train. On the recognition side, a speaker recognition system built on a single classifier generalizes poorly, and its classification accuracy is low. To address these problems, this thesis carries out the following work:

(1) Speech feature extraction methods for different application environments. In real environments, a speaker's utterances are subject to varying levels of noise interference, and choosing speech features suited to the noise environment allows speaker characteristics to be represented effectively. Experimental comparisons on real scenes show that, in a strong-noise environment, using the Mel Frequency Cepstral Coefficients (MFCC) and their differential coefficients as the features of a single-frame signal effectively suppresses the influence of noise and represents the speech signal better. In a weak-noise environment, using the Power-Normalized Cepstral Coefficients (PNCC) and their differential coefficients as the features of a single-frame signal represents the speech signal better at the same computational complexity.

(2) Voiceprint feature sample selection based on Affinity Propagation (AP) clustering. The large number of speaker voiceprint samples makes classifier training expensive. To address this, AP clustering is applied to the set of single-frame feature samples. The prototype sample of each cluster is representative of that cluster, so prototype samples are kept to stand in for the samples similar to them and the redundant samples are deleted, which accomplishes speech feature sample selection. Comparative experiments on real data sets show that the proposed method compresses the training sample set while maintaining recognition accuracy, with compression rates of 85.19% to 92.95%, greatly reducing the training cost of the classifiers.

(3) Construction of a speaker recognition system based on ensemble learning. To address the weak generalization ability of single-classifier speaker recognition models, subsets of the training samples are drawn by repeated random sampling, a number of BP neural network classification models are built using the parameter-perturbation strategy of the randomized BP neural network algorithm, and the final classification result is determined by voting. The experimental results show that this method overcomes the insufficient generalization ability of speaker recognition systems based on a single classifier and improves the recognition rate of the system.

Experiments on both the high-fidelity AISHELL Chinese database and a self-collected database recorded under different noise environments demonstrate the effectiveness of the proposed methods. The AP-clustering-based sample selection method proposed in this thesis reduces the cost of network training while preserving the representation of the speaker's intrinsic characteristics. The proposed ensemble learning framework based on multiple BP neural networks improves the generalization ability and accuracy of the speaker recognition system and enriches the theory and practice of speaker recognition technology.
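
The sample selection idea in item (2) can be illustrated with a minimal sketch. It is not the thesis's implementation: the similarity measure (negative squared Euclidean distance), the median preference, the damping factor, the iteration count, the toy two-dimensional "frames", and all function names are assumptions chosen for illustration, and the compression rate is assumed here to mean the fraction of samples removed. The message-passing updates follow the standard Affinity Propagation formulation.

```python
from statistics import median


def similarity_matrix(frames):
    """Negative squared Euclidean distance; the diagonal holds the preference."""
    n = len(frames)
    s = [[-sum((x - y) ** 2 for x, y in zip(frames[i], frames[k]))
          for k in range(n)] for i in range(n)]
    # Assumption: every sample shares the median off-diagonal similarity
    # as its preference for becoming a prototype.
    pref = median(s[i][k] for i in range(n) for k in range(n) if i != k)
    for i in range(n):
        s[i][i] = pref
    return s


def affinity_propagation(s, damping=0.5, iterations=200):
    """Return the sorted indices of the exemplar (prototype) samples."""
    n = len(s)
    r = [[0.0] * n for _ in range(n)]  # responsibilities
    a = [[0.0] * n for _ in range(n)]  # availabilities
    for _ in range(iterations):
        # Responsibility: how well k suits i relative to i's best alternative.
        for i in range(n):
            for k in range(n):
                rival = max(a[i][j] + s[i][j] for j in range(n) if j != k)
                r[i][k] = damping * r[i][k] + (1 - damping) * (s[i][k] - rival)
        # Availability: how much support k has gathered from other samples.
        for k in range(n):
            for i in range(n):
                support = sum(max(0.0, r[j][k])
                              for j in range(n) if j != i and j != k)
                new = support if i == k else min(0.0, r[k][k] + support)
                a[i][k] = damping * a[i][k] + (1 - damping) * new
    return [k for k in range(n) if r[k][k] + a[k][k] > 0]


def select_prototypes(frames):
    """Keep one prototype per cluster; map every sample to its prototype."""
    s = similarity_matrix(frames)
    exemplars = affinity_propagation(s)
    if not exemplars:
        raise ValueError("no exemplar found; try more iterations or damping")
    labels = [i if i in exemplars else max(exemplars, key=lambda k: s[i][k])
              for i in range(len(frames))]
    return exemplars, labels


def compression_rate(total, kept):
    """Fraction of samples removed (assumed definition)."""
    return 1.0 - kept / total


# Hypothetical single-frame feature vectors forming two tight groups.
frames = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0),
          (10.0, 0.0), (11.0, 0.0), (12.0, 0.0)]
exemplars, labels = select_prototypes(frames)
print("prototype indices:", exemplars)
print("prototype of each sample:", labels)
print("compression rate:", compression_rate(len(frames), len(exemplars)))
```

Only the prototype samples would then be passed on to classifier training. This naive version costs cubic time per iteration in the number of samples, so a real voiceprint data set would need a vectorized or library implementation.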
Keywords/Search Tags: Speaker Recognition, Feature Extraction, Sample Selection, Affinity Propagation Clustering, Ensemble Learning