
Signal Enhancement Based On Complex-valued Neural Networks

Posted on: 2019-05-08
Degree: Master
Type: Thesis
Country: China
Candidate: N J Zheng
Full Text: PDF
GTID: 2428330572957749
Subject: Communication and Information System
Abstract/Summary:
Speech enhancement aims to recover clean speech from noisy speech and is usually performed in the time-frequency domain. The short-time Fourier transform (STFT) maps a speech signal from the time domain into the time-frequency domain, yielding the complex-valued spectrogram of the speech. Enhancing this spectrogram is one of the main approaches to speech enhancement. However, most work focuses on enhancing the spectrogram magnitude and leaves the phase largely unexplored. There are two main reasons: 1) the magnitude is highly structured, so it can readily be used to recognize and suppress noise energy; 2) phase wrapping makes the phase values appear randomly distributed across the spectrogram, which makes the phase hard to estimate and reconstruct. Phase information has since been shown to improve speech quality, so estimating and reconstructing the phase has become an important problem in speech enhancement. This thesis discusses how to reconstruct the phase and how to enhance the complete complex-valued spectrogram.

The main work of this thesis includes:
1) Analyzing the performance of real-imaginary complex-valued neural networks (RI CVNNs) for speech enhancement and comparing them with real-valued neural networks (RVNNs). The results show that, given suitable activation functions, CVNNs perform slightly better than RVNNs.
2) Analyzing the benefits of phase reconstruction in the magnitude-phase coordinate system, and then discussing how to estimate and reconstruct the phase using deep neural networks (DNNs), where multi-objective RVNNs and CVNNs are constructed to jointly estimate magnitude and phase. Simulation results show that methods enhancing both magnitude and phase perform much better than methods based on magnitude enhancement alone. At an SNR of 0 dB, the signal-to-distortion ratio (SDR) and extended STOI (ESTOI) scores of the proposed methods are at least 0.4 and 0.02 higher, respectively, than those of the comparison methods on female speakers, and around 0.3 and 0.01 higher, respectively, on male speakers.
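As an illustration of the magnitude and phase decomposition discussed in the abstract, the following is a minimal NumPy sketch and not the thesis's method. It computes a naive STFT, splits the complex spectrogram into magnitude and wrapped phase, and compares the spectrogram error when a perfect ("oracle") magnitude is paired with the noisy phase versus the clean phase. The function name `stft`, the frame length, the hop size, and the synthetic tone-plus-noise signal are all assumptions made for this example; no neural network is involved.

```python
import numpy as np

def stft(x, frame_len=256, hop=128):
    """Naive STFT (illustrative): Hann-windowed frames, one-sided FFT per frame."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack([x[i * hop: i * hop + frame_len] * window
                       for i in range(n_frames)])
    # Result shape: (n_frames, frame_len // 2 + 1), complex-valued
    return np.fft.rfft(frames, axis=1)

# Synthetic stand-in for speech (assumption): a 440 Hz tone plus white noise
rng = np.random.default_rng(0)
fs = 8000
t = np.arange(fs) / fs
clean = np.sin(2 * np.pi * 440 * t)
noisy = clean + 0.5 * rng.standard_normal(fs)

S_clean = stft(clean)
S_noisy = stft(noisy)

# Magnitude / phase coordinates; np.angle returns the wrapped phase in (-pi, pi]
mag_clean = np.abs(S_clean)
phase_clean = np.angle(S_clean)
phase_noisy = np.angle(S_noisy)

# Oracle magnitude combined with the noisy phase vs. with the clean phase
est_noisy_phase = mag_clean * np.exp(1j * phase_noisy)
est_clean_phase = mag_clean * np.exp(1j * phase_clean)

err_noisy_phase = np.linalg.norm(est_noisy_phase - S_clean)
err_clean_phase = np.linalg.norm(est_clean_phase - S_clean)

print("error with noisy phase:", err_noisy_phase)
print("error with clean phase:", err_clean_phase)
```

Even with a perfect magnitude, reusing the noisy phase leaves a residual error in the complex spectrogram, while the clean phase reproduces it up to floating-point precision. This is the gap that phase estimation and reconstruction aim to close.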
Keywords/Search Tags: Speech enhancement, deep neural network (DNN), phase reconstruction, instantaneous frequency, complex-valued neural network (CVNN)