Three-channel Speech Separation Based On Computational Auditory Scene Analysis

Posted on:2017-02-24

Degree:Master

Type:Thesis

Country:China

Candidate:Y Wang

Full Text:PDF

GTID:2308330503482509

Subject:Information and Communication Engineering

Abstract/Summary:

In recent years, speech separation has received extensive attention, and many researchers have studied it in depth on the basis of auditory scene analysis. The main approaches to speech separation are pitch estimation, blind signal separation, and sound source localization. This thesis focuses on separating speech by sound source localization. Building on the existing dual-channel speech separation technique, it proposes a three-channel speech separation method. The specific content is as follows.

First, the thesis simulates the original dual-channel speech separation system. The target speech is mixed with interfering signals and noise to simulate the separation environment. The mixed signal then passes through peripheral auditory processing to obtain frequency units. The interaural time difference and interaural intensity difference are calculated and compared with threshold values to generate the auditory masking matrix used for resynthesis. Experiments show that when the sound sources are very close together, the separation effect is weak, and both evaluation criteria, separation gain and similarity, are low.

Second, an array element is added to the dual-channel system to form a three-element array, and the signals are then processed by the auditory model. Simulation results show that the proposed method suppresses noise and interfering speech and improves both separation gain and similarity, at the cost of increased computation.

Third, because environmental noise seriously degrades three-channel speech separation, a noise reduction step is introduced: before peripheral auditory processing, the mixed speech is pre-processed with empirical mode decomposition (EMD). Experimental results show that this improves the separation gain in noisy environments but does not noticeably improve the separation similarity.

Finally, the thesis summarizes the work, discusses directions for future research, and points out its shortcomings.
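
The abstract does not give the formulas behind the masking step, so the following is only a minimal Python sketch of how an interaural time difference (ITD) and interaural intensity difference (IID) might be estimated for each unit and thresholded into a binary mask. It is not the thesis's implementation. Every name here (`itd_iid`, `binary_mask`, `max_lag`, the tolerance parameters) is hypothetical, as are the cross-correlation ITD estimator and the array layout of shape (channels, frames, samples).

```python
import numpy as np

def itd_iid(left, right, fs, max_lag):
    """Estimate ITD (seconds) and IID (dB) for one unit.

    Assumption: ITD is taken as the cross-correlation peak within
    +/- max_lag samples; a positive value means the left signal lags
    the right one. IID is the left/right energy ratio in dB.
    """
    left = np.asarray(left, dtype=float)
    right = np.asarray(right, dtype=float)
    xcorr = np.correlate(left, right, mode="full")
    lags = np.arange(-(len(right) - 1), len(left))
    keep = np.abs(lags) <= max_lag
    best_lag = lags[keep][np.argmax(xcorr[keep])]
    eps = 1e-12  # avoids log of zero for silent units
    iid_db = 10.0 * np.log10((np.sum(left**2) + eps) / (np.sum(right**2) + eps))
    return best_lag / fs, iid_db

def binary_mask(left_units, right_units, fs, target_itd, target_iid_db,
                itd_tol, iid_tol_db, max_lag=16):
    """Return a (channels, frames) 0/1 mask.

    A unit is kept (1) when both cues fall within the given tolerances
    of the target source's cues; the tolerances stand in for the
    threshold values mentioned in the abstract.
    """
    n_ch, n_fr, _ = left_units.shape
    mask = np.zeros((n_ch, n_fr), dtype=int)
    for c in range(n_ch):
        for m in range(n_fr):
            itd, iid = itd_iid(left_units[c, m], right_units[c, m], fs, max_lag)
            if abs(itd - target_itd) <= itd_tol and abs(iid - target_iid_db) <= iid_tol_db:
                mask[c, m] = 1
    return mask
```

In a sketch like this, two sources at nearly the same position produce nearly identical ITD and IID values, so no tolerance separates them cleanly. That is consistent with the reported weak separation for closely spaced sources.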

Keywords/Search Tags:

Computational auditory scene analysis, Three-element array, Empirical mode decomposition, Interaural time difference, Interaural intensity difference

Related items

1	Mixture Speech Separation Based On Computational Auditory Scene Analysis
2	Perceptual Measurement And Research In The Effect Of Interaural Time And Level Differences To The Acoustic Localization
3	Construction And Device Research Of Visual Surrogate Model Based On Lidar
4	Effects Of Interaural Time Difference On Speech Intelligibility Of Hearing-impaired Persons In Noisy Environment
5	Measurement And Analysis Of Perceptual Characteristic Of Interaural Level Difference
6	Sound Source Separation Of Multi-voice Environment Based On Auditory Central Nervous System
7	Audio-visual Underdetermined Blind Speech Source Separation
8	Research And Implementation Of Surround Sound Signal Detection System Based On Sound Reflection
9	Measurement And Research On Relative Contribution Of Frequency And Parameter Values To Selectivity For Interaural Correlation
10	Research On Speech Enhancement Based On Computational Auditory Scene Analysis