
Monophonic Speech Separation Based On Computational Auditory Scene Analysis

Posted on: 2020-06-29    Degree: Master    Type: Thesis
Country: China    Candidate: L N Zhang    Full Text: PDF
GTID: 2438330626453262    Subject: Pattern Recognition and Intelligent Systems
Abstract/Summary:
Monophonic speech separation based on computational auditory scene analysis (CASA) takes the auditory perception phenomena of the human ear as cues and uses a computer to simulate how the ear perceives a target voice, so that the speech of interest can be extracted from a single-channel mixed signal. Speech separation based on CASA has become a mainstream method in this field. Building on the theory of computational auditory scene analysis, this thesis studies the separation of single-channel, multi-speaker mixed speech. The specific contributions are as follows:

(1) A method for accurately estimating the pitch period is proposed. Speech signals are short-term stationary, and harmonics with the same or similar pitch periods tend to be grouped into the same source in the time-frequency plane. The continuity of the pitch-period trajectory therefore indicates that the voiced frames of a segment come from the same source, and extracting the pitch period along a continuous trajectory is reliable and robust. Based on this continuity, a pitch-period spectrum is computed from the cepstrum, and a continuous pitch-period trajectory is automatically extracted on this spectrum to estimate the pitch period. At the same time, false cepstral peaks that deviate from the true trajectory are rejected according to the trajectory. A minimal sketch of cepstrum-based pitch estimation appears after this abstract.

(2) A method for separating and reconstructing voiced segments is studied. Voiced sound is quasi-periodic: its spectral distribution is regular and has a clear harmonic structure, so the harmonic structure is used to separate the voiced component. Taking the pitch-period trajectory as a cue, and noting that the pitch period and the pitch frequency are reciprocals, a comb filter is used to pick out the spectrum of each harmonic, and the voiced sound is reconstructed by the inverse Fourier transform. A sketch of such a comb-filter reconstruction is also given after this abstract.

(3) The influence of different types of noise on the pitch-period trajectory of clean Chinese speech is examined under different SNR conditions. The target speech is extracted in signal-versus-noise separation experiments, demonstrating the robustness and reliability of the pitch-period estimation method that uses the trajectory as a cue.

(4) A separation method for mixed speech in which two people speak simultaneously is studied. The reason why part of the separated target speech still contains interference from the other speaker is analyzed, and this residual interference is suppressed by reducing the auditory masking effect.
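The sketch below illustrates the general idea of frame-wise, cepstrum-based pitch-period estimation described in point (1). It is not the thesis's implementation: the sampling rate, frame length, hop size, 2-20 ms pitch search range, and voicing threshold are illustrative assumptions.

```python
# Minimal sketch of cepstrum-based pitch-period estimation (illustrative
# parameters; not the thesis's exact method).
import numpy as np

def cepstral_pitch_track(signal, fs=16000, frame_len=512, hop=256,
                         min_pitch_s=0.002, max_pitch_s=0.020):
    """Return one pitch-period estimate per frame (seconds); 0.0 for unvoiced."""
    lo = int(min_pitch_s * fs)          # shortest plausible pitch lag (samples)
    hi = int(max_pitch_s * fs)          # longest plausible pitch lag (samples)
    window = np.hamming(frame_len)
    periods = []
    for start in range(0, len(signal) - frame_len, hop):
        frame = signal[start:start + frame_len] * window
        spectrum = np.fft.rfft(frame)
        log_mag = np.log(np.abs(spectrum) + 1e-12)
        cepstrum = np.fft.irfft(log_mag)             # real cepstrum of the frame
        quefrency = np.argmax(cepstrum[lo:hi]) + lo  # peak in the pitch range
        # crude voicing decision: weak cepstral peaks are treated as unvoiced
        # (threshold 0.1 is an assumed, illustrative value)
        periods.append(quefrency / fs if cepstrum[quefrency] > 0.1 else 0.0)
    return np.array(periods)
```

The trajectory-continuity idea of the thesis could then be approximated by, for example, rejecting or median-smoothing frames whose estimated period jumps far from its neighbours, which is one simple way to exclude spurious cepstral peaks that deviate from the true trajectory.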
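For point (2), the following is a minimal sketch of a frequency-domain comb filter that keeps the bins near each harmonic of the pitch frequency and reconstructs the voiced signal with an inverse FFT. It assumes a single frame with a known, nonzero pitch period from the previous step; the harmonic half-bandwidth is an illustrative assumption.

```python
# Minimal sketch of comb-filter voiced reconstruction for one voiced frame
# (illustrative parameters; not the thesis's exact method).
import numpy as np

def reconstruct_voiced_frame(frame, pitch_period_s, fs=16000, half_bw_hz=40.0):
    """Keep spectral bins near multiples of the pitch frequency, then invert."""
    f0 = 1.0 / pitch_period_s                  # pitch frequency = 1 / pitch period
    n = len(frame)
    spectrum = np.fft.rfft(frame * np.hamming(n))
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    # comb mask: pass bins within half_bw_hz of any harmonic k * f0
    harmonics = np.arange(1, int(freqs[-1] // f0) + 1) * f0
    mask = np.zeros_like(freqs)
    for h in harmonics:
        mask[np.abs(freqs - h) <= half_bw_hz] = 1.0
    voiced_spectrum = spectrum * mask
    return np.fft.irfft(voiced_spectrum, n)    # voiced estimate via inverse FFT
```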
Keywords/Search Tags:Computational auditory scene analysis, Speech separation, Single channel, Pitch period, Separation of signal and noise, Multi-speaker