Research On Speech Separation Algorithm Based On Microphone Array

Posted on:2020-07-13

Degree:Master

Type:Thesis

Country:China

Candidate:C Y Deng

Full Text:PDF

GTID:2428330575956400

Subject:Information and Communication Engineering

Abstract/Summary:

Speech is the most convenient and fastest form of human communication. With the advent of the artificial-intelligence era, speech interaction has also become the first choice for human-machine interaction. In real life, however, the acoustic background is often complex and degrades speech quality, so the speech of interest must be extracted from a complex noise background while preserving its fidelity as far as possible. Researchers have achieved significant results, but existing algorithms are still not robust enough and the perceived quality of the extracted target speech is still not high enough. This thesis studies two problems in depth: extracting single-target speech from a complex noise background, and separating multiple speakers.

First, speech separation of a single target source in a complex noise background is studied. In the presence of noise, especially at low signal-to-noise ratio (SNR), the performance of the generalized cross-correlation with phase transform (GCC-PHAT) degrades severely, which in turn severely degrades the separation performance of generalized cross-correlation non-negative matrix factorization (GCC-NMF). To address this, the thesis proposes a new calibration function, mask-weighted GCC-PHAT (MWGCC-PHAT), and the corresponding mask-weighted GCC-NMF (MWGCC-NMF), both based on the ideal binary mask (IBM) learned by a bidirectional long short-term memory (BLSTM) network. Experiments show that MWGCC-NMF can separate low-SNR mixed speech on which GCC-PHAT-based separation fails. Compared with GCC-NMF overall, SDR increases by 25.44%, PESQ by 14.75%, OPS by 9.80%, and SNR by 6.38%, which shows that MWGCC-PHAT is more robust and performs better.

Second, multi-speaker speech separation is discussed. GCC-NMF has several shortcomings: it cannot separate sources whose positions are mirror-symmetric or approximately symmetric, and it is sensitive to position information. To address them, a GCC-NMF based on a logistic regression selection strategy is proposed, which enriches the spatial information available from a circular six-microphone array while keeping the GCC-NMF computation small and flexible. Experimental results on both simulated microphone-array data and real microphone recordings show that GCC-NMF with the logistic regression selection strategy outperforms GCC-NMF on the worst-performing microphone pair. Its average OPS exceeds that of the worst-performing microphone pair in the array by as much as 27.47. This shows that the logistic regression selection strategy greatly improves the spatial robustness and practicality of GCC-NMF.
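The abstract does not give the exact formulation of MWGCC-PHAT, so the following is only a minimal sketch of the general idea: a standard GCC-PHAT delay estimate between two microphone signals, with an optional per-frequency weight applied after phase-transform whitening. The function name `gcc_phat`, the `mask` argument, and the single-frame, frequency-only mask are assumptions made for illustration; in the thesis the mask is an IBM estimated by a BLSTM, presumably over time-frequency bins.

```python
import numpy as np

def gcc_phat(x1, x2, mask=None, max_lag=None):
    """Estimate the delay of x2 relative to x1, in samples.

    mask (hypothetical, optional): real weights in [0, 1], one per rfft bin,
    standing in for a speech-dominance mask such as an estimated IBM.
    """
    n = len(x1) + len(x2)                  # zero-pad to avoid circular wrap-around
    X1 = np.fft.rfft(x1, n=n)
    X2 = np.fft.rfft(x2, n=n)
    cross = X2 * np.conj(X1)               # cross-power spectrum
    cross /= np.abs(cross) + 1e-12         # PHAT: keep phase, discard magnitude
    if mask is not None:
        cross = cross * mask               # down-weight bins assumed noise-dominated
    cc = np.fft.irfft(cross, n=n)          # generalized cross-correlation over lags
    if max_lag is None:
        max_lag = n // 2
    # Reorder so index 0 corresponds to lag -max_lag and the last to +max_lag.
    cc = np.concatenate((cc[-max_lag:], cc[:max_lag + 1]))
    return int(np.argmax(cc)) - max_lag
```

With `mask=None` this reduces to plain GCC-PHAT. Passing a mask that zeroes noise-dominated bins illustrates why such weighting could make the correlation peak more reliable at low SNR, which is the motivation stated in the abstract.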

Keywords/Search Tags:

speech separation, IBM, MWGCC-NMF, Logistic regression, selection strategy

Related items

1	Design And Implementation Of Click Through Rating System Based On Logistic Regression Model
2	Design And Implementation Of Content Click Through Rate Prediction System Based On Logistic Regression With Elastic Net
3	Multi-speaker Speech Separation Based On Deep Learning
4	Research On The Prediction Of Insurance Payment Based On Logistic Regression Model
5	Speech Signal Enhancement And Recognition Algorithm
6	Deep Neural Network-based Acoustic Signal Synthesis And Separation Research
7	Study On The Speech Enhancement Method Of The Multiple Speech Signals Separation
8	Research On Multi-Speaker Speech Separation And Speech Recognition In Noisy Environment
9	The Study Of A Prediction Method For Search Ad CTR Based On Logistic Regression Model
10	Research And Implementation Of Single Channel Speech Separation With Unknown Number Of Speakers