Research On Speaker Recognition In Conversational Speech

Posted on:2008-11-07

Degree:Master

Type:Thesis

Country:China

Candidate:D P Liu

Full Text:PDF

GTID:2178360215490253

Subject:Computer software and theory

Abstract/Summary:

Speaker recognition (SR), also called voiceprint recognition, is a technology that identifies a speaker from his or her voice. It can be applied widely, for example in speaker identification cards, security, and telephone shopping. Conversational speech is speech that involves more than one speaker, such as conference recordings, telephone dialogues, and broadcast news. Speaker recognition in conversational speech aims to decide who is speaking when. It is a difficult problem in speech recognition that relies on segmentation and clustering techniques, and it can be used in information indexing, speaker tracking, content extraction, etc.

This dissertation first introduces the development and applications of speaker recognition. It then discusses feature extraction, including endpoint detection, spectral analysis, and phoneme duration analysis, followed by pattern matching techniques: the Gaussian Mixture Model (GMM), Hidden Markov Model (HMM), Vector Quantization (VQ), and Artificial Neural Network (ANN). Finally, maximum a posteriori (MAP) adaptation is applied. The main work is as follows:

① A phoneme duration model was built to verify the usefulness of phoneme duration for speaker recognition. Two methods were proposed to address the data shortage that arises when only a small amount of training speech is available.

② A method was proposed that divides the speech into variable-length segments of about one and a half seconds. Each test segment is formed by merging the syllables detected by endpoint detection. Because this keeps each syllable intact and gives test data of a suitable length, it improved the speaker identification rate.

③ Based on the observation that most speaker turns occur during pauses in speech, a method was proposed that identifies the head of each semantic segment and calculates the similarity of the other segments. This method reduces the running time, and it is an effective way to run the system in some poor environments at the cost of a small loss in recognition rate.

④ The MAP method was used to adapt the GMM in order to improve the robustness of the system. A probabilistic adaptation of the recognition score was also adopted, which not only reports the recognized speaker but also gives the probability that the recognized result is correct. This fuzzy result conveys more precise information, and the recognition rate can be improved further when a confidence limit is applied.
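The segmentation idea in item ② can be illustrated with a minimal sketch. The abstract does not give the actual merging rule, so everything here is an assumption made for illustration: the function name `merge_syllables`, the representation of syllables as `(start, end)` times in seconds, and the greedy rule that closes a segment once its span (pauses included) reaches the 1.5 s target.

```python
from typing import List, Optional, Tuple

# Hypothetical representation: (start_s, end_s) of one syllable
# as reported by an endpoint detector.
Interval = Tuple[float, float]


def merge_syllables(syllables: List[Interval],
                    target_s: float = 1.5) -> List[Interval]:
    """Greedily merge consecutive syllables into variable-length segments.

    A segment is closed as soon as its span reaches target_s seconds.
    Boundaries always fall on syllable edges, so no syllable is cut.
    This rule is an assumption, not the thesis's documented algorithm.
    """
    segments: List[Interval] = []
    seg_start: Optional[float] = None
    seg_end = 0.0
    for start, end in syllables:
        if seg_start is None:
            seg_start = start          # open a new segment at this syllable
        seg_end = end
        if seg_end - seg_start >= target_s:
            segments.append((seg_start, seg_end))
            seg_start = None           # next syllable starts a new segment
    if seg_start is not None:
        # Leftover syllables form a final, shorter segment.
        segments.append((seg_start, seg_end))
    return segments


if __name__ == "__main__":
    detected = [(0.0, 0.4), (0.5, 0.9), (1.0, 1.6), (1.8, 2.2), (2.3, 2.7)]
    print(merge_syllables(detected))   # [(0.0, 1.6), (1.8, 2.7)]
```

In this sketch the first three syllables are merged into one 1.6 s segment and the remaining two form a shorter trailing segment. A real system would need a policy for such short leftovers, for example attaching them to the previous segment.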

Keywords/Search Tags:

Speaker Recognition, Conversational Speech, Endpoint Detection, Speaker Clustering

Related items

1	Analysis Of Speaker Roles For Multi-speaker Conversational Speech
2	Design And Implementation On Text-Dependent Speaker Recognition System For Short Speech
3	Research On Text-Independent Speaker Recognition
4	Research On Speaker-Independent Speech Recognition System Based On HMM
5	Speaker Recognition In Noisy Environment
6	Speaker Segmentation For Mixed Speech In Multi-person Conversations
7	Research On Speaker Adaptation In Speech Recognition
8	Research On Improved Speaker Segmentation And Clustering Algorithm
9	Research On Speaker Recognition In Noisy Environment
10	Online Dialogue Voice-based Speaker Recognition Technology Research