
The Research Of Speech Segregation Based On Computational Auditory Scene Analysis And Microphone Array

Posted on: 2016-06-10
Degree: Master
Type: Thesis
Country: China
Candidate: X C Zhao
Full Text: PDF
GTID: 2308330479451062
Subject: Signal and Information Processing
Abstract/Summary:
Speech segregation based on computational auditory scene analysis (CASA) has a wide range of applications in fields such as artificial intelligence, machine perception and automatic speech segregation. Researchers in China and abroad are currently working on speech segregation in noisy environments, where the automatic segregation of speech from noise is the hardest problem. Systems that segregate mixed speech from multiple speakers rarely give satisfactory results, mainly because the separation of unvoiced and voiced speech is not handled comprehensively. Against this background, this paper selects the interaural time difference (ITD) and the interaural intensity difference (IID) as the sound segregation cues, and focuses on how to obtain the mask matrix from the masking threshold. Firstly, the paper introduces the theory of computational auditory scene analysis in detail, simulates and analyzes the speech separation algorithm based on ITD and IID, and identifies its deficiency: the target speech cannot be chosen arbitrarily. Secondly, to address this shortcoming of the existing theory, the paper presents an improved scheme that narrows the scope of screening.
Simulation results show that the optimized algorithm can separate either of two speech signals arriving from different directions, which improves the separation performance of the system and removes the limitation that the system can only separate the sound source with a relatively small delay. Thirdly, this paper uses the effective time-frequency units retained after segregation to synthesize the sound signal, and then evaluates the system from four aspects: temporal distortion, segmental SNR, signal waveforms and subjective listening. The evaluation verifies the validity of the improved method proposed in this paper. Finally, when screening the effective time-frequency units according to the masking threshold, this paper introduces three range-control parameters to optimize the segregation performance of the system. It studies the importance of each parameter in turn, runs a large number of simulation experiments to measure how changing each value affects the segregation performance of the whole system, and then determines the optimum value of each parameter.
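To make the cue-based screening described in the abstract more concrete, the following is a minimal sketch of how ITD and IID might be estimated for one time-frequency unit and turned into a binary mask. It is not the thesis's algorithm: the function names, the cross-correlation ITD estimate, and the tolerance parameters `itd_tol` and `iid_tol_db` are assumptions made for illustration, and they do not correspond to the three range-control parameters or the masking threshold that the abstract mentions.

```python
import math


def estimate_itd(left, right, max_lag):
    """Return the lag (in samples) that maximizes the cross-correlation.

    A positive lag means the right channel is a delayed copy of the left.
    """
    best_lag, best_score = 0, float("-inf")
    n = len(left)
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i in range(n):
            j = i + lag
            if 0 <= j < n:
                score += left[i] * right[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def estimate_iid_db(left, right, eps=1e-12):
    """Return the left/right energy ratio in dB (eps avoids log of zero)."""
    energy_left = sum(x * x for x in left)
    energy_right = sum(x * x for x in right)
    return 10.0 * math.log10((energy_left + eps) / (energy_right + eps))


def binary_mask(itd_grid, iid_grid, target_itd, target_iid_db,
                itd_tol, iid_tol_db):
    """Keep a unit (1) only if both cues lie near the target's cues.

    itd_tol and iid_tol_db are hypothetical tolerances, not values
    taken from the thesis.
    """
    return [
        [
            1 if (abs(itd - target_itd) <= itd_tol
                  and abs(iid - target_iid_db) <= iid_tol_db) else 0
            for itd, iid in zip(itd_row, iid_row)
        ]
        for itd_row, iid_row in zip(itd_grid, iid_grid)
    ]
```

In this sketch, choosing a different `target_itd` and `target_iid_db` selects a different source direction, which is one simple way to picture the abstract's goal of letting the target speech be chosen freely.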
Keywords/Search Tags:speech segregation, CASA, auditory masking, microphone array, ITD, IID, speech synthesis
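The abstract lists segmental SNR as one of the four evaluation aspects. As a rough illustration only, the sketch below computes a frame-averaged SNR between a clean reference and a resynthesized signal. The function name, the non-overlapping framing, and the clamping range of -10 dB to 35 dB are assumptions (the clamp is a common convention, not something stated in the source).

```python
import math


def segmental_snr(clean, estimate, frame_len,
                  min_db=-10.0, max_db=35.0, eps=1e-12):
    """Average the per-frame SNR (in dB) over non-overlapping frames.

    Each frame's SNR is clamped to [min_db, max_db] so that silent or
    perfectly reconstructed frames do not dominate the mean.
    """
    frame_snrs = []
    for start in range(0, len(clean) - frame_len + 1, frame_len):
        ref = clean[start:start + frame_len]
        est = estimate[start:start + frame_len]
        signal_energy = sum(x * x for x in ref)
        error_energy = sum((x - y) ** 2 for x, y in zip(ref, est))
        snr_db = 10.0 * math.log10((signal_energy + eps) / (error_energy + eps))
        frame_snrs.append(min(max(snr_db, min_db), max_db))
    if not frame_snrs:
        raise ValueError("signal is shorter than one frame")
    return sum(frame_snrs) / len(frame_snrs)
```

Averaging per-frame values in dB, rather than computing one global ratio, weights quiet and loud passages more evenly, which is the usual reason for preferring a segmental measure on speech.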