Research On Sound Source Separation Technology In Speech Recognition System

Posted on:2021-01-18

Degree:Master

Type:Thesis

Country:China

Candidate:Y M Liang

Full Text:PDF

GTID:2428330614458165

Subject:Information and Communication Engineering

Abstract/Summary:

With the rapid development of artificial intelligence technology, speech recognition is increasingly integrated into daily life. However, real application environments inevitably contain interfering noise or overlapping voices, which seriously degrades the performance of speech recognition systems. A speech signal corrupted by noise or mixed with other voices therefore needs to be processed by sound source separation before it is passed to the speech recognition system. According to the type of mixed audio signal being processed, sound source separation techniques can be divided into speech enhancement techniques, which process noisy speech signals, and multi-speaker separation techniques, which process mixtures of several speakers. This thesis mainly studies a speech enhancement algorithm based on deep neural networks and a multi-speaker separation algorithm based on deep clustering. The main research contents are as follows.

Firstly, this thesis studies an a priori signal-to-noise ratio (SNR) estimator based on deep neural networks and the overall speech enhancement framework built around it. The enhancement performance of this algorithm depends on the accuracy of the a priori SNR estimate. When the SNR is low and the noise is hard to distinguish from speech, the a priori SNR estimate is not accurate enough, which degrades the speech enhancement result. To solve this problem, this thesis proposes a speech enhancement algorithm that introduces a blind source separation stage, independent low-rank matrix analysis, before the a priori SNR estimator. The proposed algorithm first separates the noisy speech signal into noise and speech, so as to improve the SNR of the audio signal that contains the speech. Experimental results show that, at low SNR and under some instrumental noise interference, the proposed algorithm achieves better speech enhancement than the baseline algorithm and improves the recognition rate and robustness of the speech recognition system.

Secondly, this thesis studies the multi-speaker separation algorithm based on deep clustering. The K-means clustering used by this algorithm is sensitive to the choice of initial cluster centers, which makes the separation performance unstable. To address this problem, this thesis proposes a deep clustering multi-speaker separation algorithm based on hierarchical clustering. The proposed algorithm does not need initial cluster centers to be selected. Experimental results show that the proposed algorithm achieves higher separation performance and stability than the baseline deep clustering algorithm and reduces the word error rate of the speech recognition system.
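The clustering step in the second contribution can be illustrated with a minimal sketch. It is not the thesis's implementation: the abstract does not state the linkage criterion, the distance measure, or the embedding dimension, so average linkage, Euclidean distance, and the two-dimensional toy embeddings below are all assumptions, and the function names are hypothetical. The sketch only shows why agglomerative (bottom-up hierarchical) clustering needs no initial cluster centers: every time-frequency bin starts as its own cluster and the two closest clusters are merged until one cluster per speaker remains.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def agglomerative_labels(embeddings, num_speakers):
    """Assign a speaker label to every embedding by average-linkage merging.

    Hypothetical helper: the linkage rule is an assumption, not taken
    from the thesis. No initial centers are required.
    """
    # Start with each time-frequency bin in its own cluster.
    clusters = [[i] for i in range(len(embeddings))]
    while len(clusters) > num_speakers:
        best = None  # (distance, index_a, index_b) of the closest pair
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Average pairwise distance between the two clusters.
                total = sum(
                    euclidean(embeddings[i], embeddings[j])
                    for i in clusters[a]
                    for j in clusters[b]
                )
                dist = total / (len(clusters[a]) * len(clusters[b]))
                if best is None or dist < best[0]:
                    best = (dist, a, b)
        _, a, b = best
        # Merge the closest pair and drop the absorbed cluster.
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    labels = [0] * len(embeddings)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels


def binary_masks(labels, num_speakers):
    """One binary mask per speaker: 1 where the bin belongs to that speaker."""
    return [[1 if lab == s else 0 for lab in labels] for s in range(num_speakers)]


if __name__ == "__main__":
    # Toy embeddings for six time-frequency bins (made-up values).
    toy = [(1.0, 0.0), (0.9, 0.1), (0.0, 1.0), (0.1, 0.9), (0.95, 0.05), (0.05, 0.95)]
    labels = agglomerative_labels(toy, num_speakers=2)
    print(labels)  # [0, 0, 1, 1, 0, 1]
    print(binary_masks(labels, 2))
```

Because the merge order is fully determined by the distances, repeated runs on the same embeddings give the same masks, whereas K-means can converge to different partitions from different initial centers. The pairwise search here is cubic in the number of bins and is only suitable for a toy example; a real spectrogram would need a library implementation or a subsampling strategy.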

Keywords/Search Tags:

sound source separation, speech recognition, speech enhancement, multi-speaker separation

Related items

1	The Research Of In-Car Speech Enhancement Algorithm Based On Blind Source Separation
2	Study On The Speech Enhancement Method Of The Multiple Speech Signals Separation
3	Multi-speaker Speech Separation Based On Deep Learning
4	Study On Speech Enhancement And Separation
5	Research On Speech Preprocessing Of Speech Recognition For Multi-talker Conversations In Complex Acoustic Environments
6	Research On Auto-regressive Deep Neural Networks' Based Monaural Speech Separation
7	Study On Speech Separation And Speech Enhancement Methods
8	The Study Of Dual-channel Speech Separation Technology For Smart Mobile Devices
9	Underdetermined Source Separation And Its Application To Speech Processing
10	Short Speech Speaker Recognition Method Based On Deep Learning And Its Application In Speech Separation