Blind Speech Deconvolution And Room Impulse Response Modeling Based On Sparse Representation

Posted on: 2019-03-31
Degree: Doctor
Type: Dissertation
Country: China
Candidate: J Guan
Full Text: PDF
GTID: 1368330566998333
Subject: Computer application technology
Abstract/Summary:
The aim of blind speech deconvolution is to recover both the original speech source and the room impulse response (RIR) from the observed reverberant speech. This can benefit several applications, such as automatic speech recognition (ASR), hands-free devices, and hearing aids. In addition, the recovered RIR can be used in applications such as sound reproduction and speech enhancement. However, blind deconvolution is an ill-posed and under-determined problem, so additional prior information is often required to solve it. In this dissertation, we focus on the blind speech deconvolution problem for a single-input single-output (SISO) room acoustic system. We start by exploiting prior knowledge about the RIR, using two types of methods: imposing constraints on the RIR, and RIR modeling. The main contributions of this work are as follows.

Firstly, we study the blind speech deconvolution problem for a sparse acoustic system. A sparse blind speech deconvolution model is introduced for acoustic systems with low reverberation. The proposed model considers not only the sparsity of the acoustic system but also the dynamic range of the reverberant speech, which reduces the space of possible solutions for the estimated RIR and source speech. To solve the problem, an l1-norm blind speech deconvolution method is proposed. As demonstrated in our experiments, the proposed method provides superior performance for deconvolution of a sparse acoustic system compared with state-of-the-art methods. The results also show that the dynamic range regularization attenuates the scale ambiguity problem in the estimates.

Secondly, we study the blind speech deconvolution problem for acoustic systems with high reverberation. A joint sparsity and density regularization model is proposed for the blind speech deconvolution of such systems. The model accounts for both the sparsity and the density of the RIR by imposing an l1-norm constraint on the early part of the RIR and an l2-norm constraint on the late part. To solve the problem, we propose a blind speech deconvolution method based on joint l1-l2 regularization. As demonstrated in our experiments, with the proposed method both the source signal and the early part of the RIR can be well reconstructed, while the late part of the RIR can be suppressed by controlling the l2-norm penalty parameter.

Thirdly, we study the RIR modeling problem in order to exploit the acoustic characteristics provided by RIRs. Existing dictionary learning algorithms have been developed mainly for standard matrices (i.e., matrices with scalar elements), and little attention has been paid to polynomial matrices, despite their wide use for describing convolutive signals and for modeling acoustic channels in room and underwater acoustics. We therefore propose a polynomial dictionary learning technique to deal with signals with time delays. We present two types of polynomial dictionary learning methods, based on the fact that a polynomial matrix can be represented either as a polynomial of matrices (i.e., the coefficient corresponding to each time lag is a matrix with scalar elements) or, equivalently, as a matrix of polynomial elements (i.e., each element of the matrix is a polynomial). The first method allows one to extend any state-of-the-art dictionary learning method to the polynomial case; the second allows one to process the polynomial matrix directly, without having to access its coefficient matrices. A sparse representation method is also presented for reconstructing polynomial "signals" based on a polynomial dictionary. Simulations are provided to demonstrate the performance of the proposed algorithms, e.g., for RIR modeling and for polynomial "signal" reconstruction from noisy measurements.

Finally, we study the blind speech deconvolution problem using the polynomial dictionary learning technique. Because the previously proposed sparse method and joint-norm method both have limited applicability, we present a blind speech deconvolution model with a polynomial dictionary and sparse representation, in which a pre-trained polynomial dictionary provides the prior information for sparsely representing the acoustic RIR. To solve the problem, an alternating optimization method is proposed to estimate the source speech and the RIR. Simulations are provided to demonstrate the performance of the proposed method compared with the previously proposed methods.
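
The two equivalent views of a polynomial matrix mentioned in the abstract (a polynomial of matrices versus a matrix of polynomials) can be illustrated with a small sketch. This is not the dissertation's algorithm: the function names, the nested-list storage layout, and the toy 2x2 dictionary below are assumptions made only for illustration. The sketch applies the same polynomial dictionary to a polynomial "signal" in both representations and arrives at the same coefficients.

```python
# Illustrative sketch only; all names and the toy data are hypothetical.
# A polynomial matrix D(z) = D_0 + D_1 z^-1 + ... can be stored as
#   (a) a polynomial of matrices: coeffs[lag][row][col]
#   (b) a matrix of polynomials:  polys[row][col][lag]

def to_matrix_of_polynomials(coeffs):
    """Re-index coeffs[lag][row][col] as polys[row][col][lag]."""
    lags, rows, cols = len(coeffs), len(coeffs[0]), len(coeffs[0][0])
    return [[[coeffs[l][i][j] for l in range(lags)]
             for j in range(cols)] for i in range(rows)]

def convolve(a, b):
    """Multiply two polynomials given as coefficient lists."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def apply_elementwise(polys, x):
    """View (b): y_i(z) = sum_j D_ij(z) x_j(z), one polynomial product per element."""
    out_len = len(polys[0][0]) + len(x[0]) - 1
    y = []
    for row in polys:
        acc = [0.0] * out_len
        for d_ij, x_j in zip(row, x):
            for k, v in enumerate(convolve(d_ij, x_j)):
                acc[k] += v
        y.append(acc)
    return y

def apply_lagwise(coeffs, x):
    """View (a): accumulate ordinary matrix-vector products, one per pair of lags."""
    lags, rows, cols = len(coeffs), len(coeffs[0]), len(coeffs[0][0])
    x_len = len(x[0])
    y = [[0.0] * (lags + x_len - 1) for _ in range(rows)]
    for l in range(lags):
        for t in range(x_len):
            for i in range(rows):
                for j in range(cols):
                    y[i][l + t] += coeffs[l][i][j] * x[j][t]
    return y

# Toy dictionary: D(z) = I + diag(0.5, -0.5) z^-1
coeffs = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[0.5, 0.0], [0.0, -0.5]],
]
# Toy polynomial "signal": x_0(z) = 1 + 2 z^-1, x_1(z) = 3
x = [[1.0, 2.0], [3.0, 0.0]]

y_elementwise = apply_elementwise(to_matrix_of_polynomials(coeffs), x)
y_lagwise = apply_lagwise(coeffs, x)
print(y_elementwise)
print(y_lagwise)
```

Both calls compute the same product, which is why a method written for ordinary coefficient matrices and a method that works on the polynomial elements directly can describe the same dictionary.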
Keywords/Search Tags: blind speech deconvolution, dictionary learning, sparse representation, polynomial matrix, room impulse response, acoustic channel modeling