The Study Of Hierarchical Speaker Segmentation And Relative Algorithms

Posted on: 2007-02-24
Degree: Master
Type: Thesis
Country: China
Candidate: M Yang
Full Text: PDF
GTID: 2178360182466690
Subject: Computer applications
Abstract/Summary:
Speaker recognition (SR), which identifies or verifies people by their voice, is regarded as the most natural and convenient of the biometric methods. Speaker indexing can be viewed as an application of automatic speaker recognition. However, to cope with the variety and openness of real audio environments, it still has to solve many crucial problems, such as speaker segmentation.

Aiming at the major difficulties of speaker segmentation and the disadvantages of current segmentation methods, namely the lack of available information or knowledge, the influence of noise and environment, and the incompleteness of current model-based and distance-based segmentation methods, we propose a hierarchical speaker segmentation system framework and study the related algorithms under this framework. The main contributions of the work are as follows:

1. A hierarchical speaker segmentation system framework is proposed to address the three disadvantages of speaker segmentation listed above. The hierarchical structure, together with the pre-trained models in each layer, makes much more information available during segmentation. Voice detection and channel clustering are introduced to minimize the influence of noise and channel diversity. In addition, the pre-segmentation layer and the divide-and-rule layer enhance the performance of the segmentation layer.
2. The feature distributions of speech and non-speech, and the form of the change between them, are examined for speech detection. We present a χ²-based audio change detection algorithm and a speech/non-speech decision tree: the χ² distribution is used to detect audio endpoints and to segment the audio, and speech is then detected by classification with the trained decision tree.
3. We examine the diversity of environment channels in audio and its influence on speaker recognition and speaker indexing, and then analyze the effect and applicability of two solutions: channel compensation and channel clustering. The concept of the anchor model is then introduced into the field of channel clustering, and we present an anchor-model-based channel clustering method.
4. As an integration of the work above, we propose a hierarchical speaker segmentation system framework, which addresses the difficulties of speaker segmentation with the help of its hierarchical structure and prior knowledge.
5. We propose a pitch-based rapid speech segmentation method for speaker indexing, which segments speakers with good accuracy and low time cost under ideal conditions.
6. We improve the anchor model method by proposing a new metric based on sequential comparison, which makes the anchor model method more accurate and robust in speaker verification.

This work is supported by the National Science Fund for Distinguished Young Scholars (60525202), the Program for New Century Excellent Talents in University (NCET-04-0545), and the Key Program of the Natural Science Foundation of China (60533040).
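
The anchor model idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the thesis's method: the one-dimensional Gaussian anchors, the function names, the toy feature values, and the Euclidean distance are all assumptions chosen for illustration, and the sequential-comparison metric proposed in the thesis is not reproduced here. The sketch only shows the general principle that a segment is described by its scores against a fixed set of pre-trained anchor models, and that segments are then compared in that score space.

```python
import math

def gaussian_loglik(x, mean, var):
    """Log-likelihood of a scalar feature under a 1-D Gaussian."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)

def characterization_vector(frames, anchors):
    """Average log-likelihood of the frames under each anchor model."""
    return [
        sum(gaussian_loglik(x, mean, var) for x in frames) / len(frames)
        for mean, var in anchors
    ]

def euclidean(u, v):
    """Euclidean distance between two characterization vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

# Hypothetical anchor models as (mean, variance) pairs; real systems
# would use models trained on multi-dimensional acoustic features.
ANCHORS = [(-2.0, 1.0), (0.0, 1.0), (2.0, 1.0)]

# Toy scalar "features" for three segments (made-up values).
seg_a = [-2.1, -1.9, -2.3, -1.8]
seg_b = [-2.0, -2.2, -1.7, -2.1]
seg_c = [1.9, 2.2, 2.1, 1.8]

vec_a = characterization_vector(seg_a, ANCHORS)
vec_b = characterization_vector(seg_b, ANCHORS)
vec_c = characterization_vector(seg_c, ANCHORS)

same = euclidean(vec_a, vec_b)  # segments with similar features
diff = euclidean(vec_a, vec_c)  # segments with dissimilar features
print(same < diff)
```

In this toy setup, segments with similar features end up close together in the anchor space, while dissimilar segments end up far apart. The same comparison could in principle be applied to speakers or to channels, depending on what the anchors are trained to represent.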
Keywords/Search Tags: Speaker Recognition, Speaker Indexing, Speaker Segmentation, Anchor Model, Speech Detection