
Research on Speaker Segmentation and Clustering for Wireless Intercom Audio

Posted on: 2017-02-13
Degree: Master
Type: Thesis
Country: China
Candidate: Z P Xiao
GTID: 2308330503958230
Subject: Information and Communication Engineering
Abstract/Summary:
With the information explosion and the arrival of the big data era, the channels for acquiring audio and the volume of audio acquired are growing rapidly, and audio management is becoming increasingly complex. Speaker segmentation and clustering, as one approach to audio management, has become an active research topic worldwide in recent years. It is also a necessary pre-processing step for speaker identification, speaker tracking, and speaker adaptation.

This thesis focuses on speaker segmentation and clustering for speech from walkie-talkie radio systems, which is characterized by multiple speakers, channel changes, and heavy noise. We first analyze existing unsupervised speaker segmentation algorithms and apply the Bayesian information criterion (BIC), generalized likelihood ratio (GLR), and symmetric Kullback-Leibler (KL2) criteria, using moving time windows of different scales to locate speaker change points quickly; the best F value obtained is 65.47%. Speaker clustering is performed by bottom-up hierarchical clustering with the cross likelihood ratio (CLR) distance, and clustering terminates when the specified number of speakers in the system is reached. Because radio intercom speech is heavily corrupted by noise, classical spectral subtraction and Wiener filtering are used for speech enhancement. We then repeat the speaker segmentation and clustering experiments on the enhanced speech and compare the performance.

To address the performance bottleneck of the above algorithms, we draw on our observation and analysis of the walkie-talkie audio: a push-to-talk click is generated each time a talk turn between the ground commander and the pilots ends. Because push-to-talk clicks mark speaker change points, this thesis proposes a speaker segmentation technique that fuses in acoustic event information. After surveying existing event detection algorithms, we use time-domain analysis and model matching to detect these acoustic events.
Once the events have been located with high recall and high accuracy, they are used to correct the speaker segmentation results. The proposed method raises the final F value to 77.18%, a relative improvement of 17.88%, while the recall rate and accuracy rate improve by 20.01% and 15.50%, respectively.
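
As a rough illustration of the BIC-based change detection mentioned in the abstract, the sketch below scans a single window of feature frames for the most likely speaker change point. It is not the thesis's implementation: it assumes each segment is modeled by one full-covariance Gaussian over per-frame features (for example MFCCs), and the names `delta_bic`, `find_change_point`, `margin`, and `penalty_weight` are hypothetical. The multi-scale moving windows and the GLR and KL2 criteria are not covered.

```python
import numpy as np


def _logdet_cov(frames):
    """Log-determinant of the ML covariance of frames (n x d), lightly regularized."""
    d = frames.shape[1]
    cov = np.atleast_2d(np.cov(frames, rowvar=False, bias=True))
    _, logdet = np.linalg.slogdet(cov + 1e-6 * np.eye(d))
    return logdet


def delta_bic(frames, t, penalty_weight=1.0):
    """Delta-BIC for splitting frames at index t; positive values favor a change."""
    n, d = frames.shape
    left, right = frames[:t], frames[t:]
    # Likelihood gain of two Gaussians (left, right) over a single Gaussian.
    gain = 0.5 * (n * _logdet_cov(frames)
                  - len(left) * _logdet_cov(left)
                  - len(right) * _logdet_cov(right))
    # Extra parameters of the two-model hypothesis: one mean and one covariance.
    n_params = d + 0.5 * d * (d + 1)
    penalty = 0.5 * penalty_weight * n_params * np.log(n)
    return gain - penalty


def find_change_point(frames, margin=20, penalty_weight=1.0):
    """Return the frame index with the largest positive delta-BIC, or None."""
    n = frames.shape[0]
    # Keep at least `margin` frames on each side so covariances are estimable.
    candidates = range(margin, n - margin)
    scores = [delta_bic(frames, t, penalty_weight) for t in candidates]
    if not scores:
        return None
    best = int(np.argmax(scores))
    return candidates[best] if scores[best] > 0 else None
```

In a setup like the one the abstract describes, a detected push-to-talk click could then be used to confirm or correct the index returned by such a scan; how the thesis actually combines the two sources is not specified here.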
Keywords/Search Tags: Information fusion, speaker segmentation and clustering, event detection