Design And Implementation Of Speaker Diarization System

Posted on: 2015-07-13  Degree: Master  Type: Thesis
Country: China  Candidate: X Zhang  Full Text: PDF
GTID: 2298330467995256  Subject: Communication and Information System
Abstract/Summary:
Speaker diarization analyzes audio files that contain multiple speakers. The first task is to locate the points where the speaking conditions change, i.e. where one speaker gives way to another or where speech gives way to non-speech. The second task is to cluster the resulting segments by speaker identity, so that all segments spoken by the same speaker receive the same label. When the number, identity and gender of the speakers and the speaking conditions in the audio file are all unknown, this amounts to determining "who spoke when?"

Speaker diarization has broad application prospects. For audio data such as broadcast news, meeting recordings and telephone conversations, it can detect and track the voice segments of specific speakers and thereby support efficient extraction of rich transcriptions from large audio datasets. A speaker diarization system mainly consists of four modules: speech feature extraction, speech detection, speaker change point detection and speaker clustering. This thesis focuses on the following aspects:
(1) Summarize the current state of development and the basic technologies of speaker diarization systems.
(2) Analyze the E-HMM based speaker diarization system proposed by the LIA (University of Avignon) and identify its shortcomings.
(3) Add a speaker purification module to the LIA speaker diarization (LIASD) system in order to improve it.
(4) Design a multi-module speaker diarization (MMSD) system in order to implement a more effective speaker diarization system.
(5) Research and design the speech detection, speaker change detection, speaker clustering, speaker purification and short-segment post-processing modules.
(6) Compare and analyze the performance of the LIASD, Purified-LIASD (P-LIASD) and MMSD systems, evaluate their advantages and disadvantages, and use the diarization error rate (DER) to judge their accuracy and robustness.
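The abstract names DER as the evaluation metric without defining it. As commonly defined, DER is the sum of missed speech, false-alarm speech and speaker-confusion time, divided by the total reference speech time, with hypothesis labels mapped one-to-one onto reference speakers so as to minimize the error. The following is a minimal frame-level sketch of that definition, not the thesis's scoring procedure. It assumes non-overlapping speech, equal-length frame label sequences, `None` as the non-speech marker, and a brute-force search over label mappings that is only practical for a handful of speakers. The function name `frame_der` and the example labels are hypothetical.

```python
from itertools import permutations


def frame_der(reference, hypothesis):
    """Frame-level diarization error rate (illustrative sketch).

    Each sequence holds one label per frame; None marks non-speech.
    Assumes at most one speaker per frame (no overlapping speech).
    """
    if len(reference) != len(hypothesis):
        raise ValueError("reference and hypothesis must have the same length")

    pairs = list(zip(reference, hypothesis))
    total_speech = sum(r is not None for r, _ in pairs)
    if total_speech == 0:
        raise ValueError("reference contains no speech frames")

    # Speech in the reference that the system labeled as non-speech.
    missed = sum(r is not None and h is None for r, h in pairs)
    # Non-speech in the reference that the system labeled as speech.
    false_alarm = sum(r is None and h is not None for r, h in pairs)
    # Frames where both sides agree that someone is speaking.
    both = [(r, h) for r, h in pairs if r is not None and h is not None]

    ref_speakers = sorted({r for r in reference if r is not None})
    hyp_speakers = sorted({h for h in hypothesis if h is not None})

    # Brute-force the best one-to-one mapping hypothesis -> reference.
    # Padding with None lets a hypothesis speaker stay unmapped.
    candidates = ref_speakers + [None] * len(hyp_speakers)
    best_correct = 0
    for assignment in permutations(candidates, len(hyp_speakers)):
        mapping = dict(zip(hyp_speakers, assignment))
        correct = sum(mapping[h] == r for r, h in both)
        best_correct = max(best_correct, correct)

    confusion = len(both) - best_correct
    return (missed + false_alarm + confusion) / total_speech


# Hypothetical example: 10 frames, two reference speakers "A" and "B".
reference = ["A", "A", "A", "A", None, None, "B", "B", "B", "B"]
hypothesis = ["x", "x", "x", "y", None, "y", "y", "y", "y", None]
der = frame_der(reference, hypothesis)
```

In this example one frame is missed, one is a false alarm and one is attributed to the wrong speaker, out of eight reference speech frames. Scoring tools used in practice typically operate on time-stamped segments rather than frames and may ignore a short collar around reference boundaries; those details are left out here.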
Keywords/Search Tags: speaker diarization, speaker change point detection, speaker clustering, speaker purification