Speech Enhancement Based On Representation Learning

Posted on:2018-01-24

Degree:Master

Type:Thesis

Country:China

Candidate:J W Wu

Full Text:PDF

GTID:2428330512494314

Subject:Circuits and Systems

Abstract/Summary:

Speech is one of the most important means of human-computer interaction. Intelligible speech benefits the success of that interaction, and in particular the recognition performance of speech recognition systems. Studying speech enhancement to improve the intelligibility of the speech signal is therefore of great theoretical and practical value, and speech enhancement is an active research topic in speech signal processing. The key to speech enhancement is finding an effective representation of the speech signal. An effective representation is one that can distinguish clean speech from noisy speech, and can also distinguish the different signal components within the speech, so that the part of interest is enhanced while the noise and the unneeded parts are suppressed. In this thesis, we study speech enhancement based on representation learning from two perspectives: adaptive dictionary learning and deep neural networks. The main contents and contributions are as follows:

(1) A Bayesian adaptive dictionary method based on sparse representation is introduced into the field of speech representation for the first time. Dictionary learning, sparse coefficient estimation and noise variance estimation are integrated into a joint Bayesian posterior estimation procedure using Beta Process Factor Analysis (BPFA). The parameters are described by probability distributions, which overcomes the over-dependence of traditional dictionary learning methods on parameter settings. Time-domain speech enhancement experiments were carried out on the NOIZEUS database, and the ability of the method to learn the dictionary and the sparse representation adaptively is discussed. The results show that the method removes environmental noise effectively and improves the perceived listening quality without requiring a prior estimate of the noise variance.

(2) From the perspective of deep learning, the adaptive dictionary method based on sparse representation amounts to a shallow network, which can extract only low-level features of the signal. High-level features, however, are needed to make a speech enhancement algorithm more robust. In addition, the speech signal exhibits strong temporal correlation, which the current adaptive dictionary method has difficulty describing effectively. In view of this, a Bidirectional Long Short-Term Memory (BLSTM) recurrent neural network is used to learn the relationship between noisy speech features and clean speech features, so as to make effective use of the temporal correlation of the speech signal and of high-level semantic features. The features used are Mel-frequency cepstral coefficients (MFCC), and the experiments were carried out on a Chinese-language database. Speech recognition results in noisy environments show that the BLSTM-based speech enhancement method is robust to background noise.
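As a rough illustration of the sparse-representation idea behind contribution (1), the following Python sketch denoises a signal frame by frame by keeping only the largest coefficients over a dictionary. It is not the BPFA method of the thesis: a fixed orthonormal DCT dictionary stands in for the learned one, the number of retained atoms is set by hand rather than inferred, and the function names, frame length and noise level are all assumptions chosen for illustration.

```python
import math
import random


def dct_dictionary(n):
    """Return n orthonormal DCT-II atoms, each a list of length n.

    Assumption: a fixed DCT dictionary replaces the adaptively learned one.
    """
    atoms = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        atoms.append(
            [scale * math.cos(math.pi * (t + 0.5) * k / n) for t in range(n)]
        )
    return atoms


def sparse_code(frame, atoms, n_keep):
    """Project the frame onto every atom, then zero all but the
    n_keep largest-magnitude coefficients."""
    coeffs = [sum(a * x for a, x in zip(atom, frame)) for atom in atoms]
    order = sorted(range(len(coeffs)), key=lambda k: abs(coeffs[k]), reverse=True)
    kept = set(order[:n_keep])
    return [c if k in kept else 0.0 for k, c in enumerate(coeffs)]


def reconstruct(coeffs, atoms):
    """Rebuild a frame as the weighted sum of dictionary atoms."""
    n = len(atoms[0])
    return [sum(c * atom[t] for c, atom in zip(coeffs, atoms)) for t in range(n)]


def denoise(signal, frame_len=32, n_keep=2):
    """Denoise non-overlapping frames; trailing samples that do not
    fill a whole frame are dropped."""
    atoms = dct_dictionary(frame_len)
    out = []
    for start in range(0, len(signal) - frame_len + 1, frame_len):
        frame = signal[start:start + frame_len]
        out.extend(reconstruct(sparse_code(frame, atoms, n_keep), atoms))
    return out


def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


if __name__ == "__main__":
    random.seed(0)
    atoms = dct_dictionary(32)
    # Synthetic "clean" frame that is exactly 2-sparse in the dictionary.
    frame = [3.0 * a + 1.5 * b for a, b in zip(atoms[2], atoms[5])]
    clean = frame * 8
    noisy = [x + random.gauss(0.0, 0.1) for x in clean]
    denoised = denoise(noisy, frame_len=32, n_keep=2)
    print("noisy MSE:   ", mse(noisy, clean))
    print("denoised MSE:", mse(denoised, clean))
```

In this toy setting the clean signal is sparse in the dictionary while the noise spreads over all atoms, so discarding the small coefficients removes most of the noise. A Bayesian approach such as the one the abstract describes would instead infer the dictionary, the sparsity pattern and the noise variance from the noisy data itself.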

Keywords/Search Tags:

Speech Enhancement, Representation Learning, Dictionary Learning, Recurrent Neural Network

Related items

1	Speech Enhancement Based On Sparse Representation And Dictionary Learning
2	Dictionary Learning Algorithms And Its Application In Speech Enhancement
3	Research On The Improvement Of Speech Enhancement Algorithm Based On Sparse Representation And Dictionary Learning
4	Speech Enhancement Research Based On Small Dictionary Learning
5	Research On Speech Enhancement Algorithms Based On Deep Learning
6	Research Of Speech Enhancement Algorithm Based On Dictionary Learning
7	Speech Enhancement Based On Sparse Representation And Joint Dictionary Learning
8	A Single-channel Speech Enhancement Technology Based On Jointly Constrained Dictionary Learning
9	Extremely Low Signal-to-noise Ratio Speech Enhancement Method Based On Deep Learning
10	Research On Application Of Sparse Representation And Feature Dictionary In Speech Enhancement