Advancements in Statistical Signal Processing and Machine Learning for Speech Enhancement

Posted on: 2018-06-04
Degree: Ph.D
Type: Dissertation
University: The University of Texas at Dallas
Candidate: Xing, Hua
Full Text: PDF
GTID: 1478390020456131
Subject: Electrical engineering
Abstract/Summary:
This dissertation focuses on advancing statistical signal processing and machine learning techniques for speech enhancement. Statistical signal processing and machine learning are among the most important and widely used technologies in speech research. Although both groups of techniques have achieved significant success in speech enhancement, existing solutions usually apply them independently. This dissertation verifies the idea that combining techniques from the two domains can further improve enhancement performance, by applying them cooperatively to two speech enhancement problems.

In the first problem, frequency shift estimation in single-sideband speech, existing methods, which are based exclusively on signal processing, suffer from a non-uniqueness issue caused by the periodic character of voiced speech. To address this issue, a preliminary step of uniqueness interval detection is proposed, based on an analysis of the origins of the estimation errors. Three machine learning techniques, GMM-SVM, i-Vector, and stacked autoencoder, are adopted to detect the uniqueness interval in this first step. A feature specially designed to represent the frequency shift property is developed as the classifier input. Experimental results verify the benefit of introducing machine learning techniques into the existing solution.

In the second problem, speech denoising, the line spectral pair (LSP), an efficient speech representation that encodes spectral formant location and bandwidth, is adopted as an input to a DNN regression architecture along with conventional log-spectral features. To capture the dynamic properties of the speech signal, the first and second derivatives of the LSPs are also used. In addition, Auto-LSP, an efficient iterative denoising algorithm, is applied as a post-processing step to further improve the enhanced result. The effectiveness of the proposed feature and post-processing operation is confirmed by denoising results in terms of three objective criteria.

Collectively, the contributions made in these two speech enhancement topics support the idea that the advantages of signal processing and machine learning can complement each other to improve the performance of the overall enhancement techniques.
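
As an illustration of the dynamic features mentioned for the denoising problem (first and second derivatives of the LSPs), the following is a minimal sketch, not the dissertation's implementation. The abstract does not state how the derivatives are computed, so a simple central difference over neighbouring frames is assumed here; speech toolkits often use a regression over a wider window instead. The function names `delta` and `add_dynamics` are hypothetical.

```python
from typing import List

Frame = List[float]  # one frame of static features, e.g. LSP coefficients


def delta(frames: List[Frame]) -> List[Frame]:
    """Approximate the time derivative of each coefficient.

    Assumption: central difference between the previous and next frame,
    falling back to a one-sided difference at the first and last frame.
    """
    n = len(frames)
    if n < 2:
        # A single frame has no neighbours, so its derivative is taken as zero.
        return [[0.0] * len(f) for f in frames]
    out = []
    for t in range(n):
        lo = max(t - 1, 0)
        hi = min(t + 1, n - 1)
        span = hi - lo  # 2 in the interior, 1 at the edges
        out.append([(b - a) / span for a, b in zip(frames[lo], frames[hi])])
    return out


def add_dynamics(frames: List[Frame]) -> List[Frame]:
    """Concatenate static features with first and second derivatives."""
    d1 = delta(frames)   # first derivative
    d2 = delta(d1)       # second derivative, computed from the first
    return [f + a + b for f, a, b in zip(frames, d1, d2)]


if __name__ == "__main__":
    # Toy input: 5 frames of 2 coefficients that rise linearly over time.
    toy = [[0.1 * t, 0.2 * t] for t in range(5)]
    for row in add_dynamics(toy):
        print(row)
```

Each output frame holds the static coefficients followed by their first and second derivatives, so the feature dimension triples. In a setup like the one described, such a vector would be concatenated with the log-spectral features before being passed to the DNN.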
Keywords/Search Tags:Machine learning, Speech, Enhancement, Techniques