Optimization On Machine Learning Model For Speech Enhancement Based On Subjective Auditory Feedback

Posted on: 2020-01-06
Degree: Master
Type: Thesis
Country: China
Candidate: F Q Ye
GTID: 2428330590473762
Subject: Electronic and communication engineering

Abstract/Summary:
Speech is one of the most important means of human communication, yet speech signals are susceptible to noise interference during transmission. The problem is particularly acute in modern society, where electronic communication is booming and noise strongly degrades the intelligibility of speech transmitted over long distances. Speech enhancement has therefore become an indispensable part of modern speech signal processing. The performance of speech enhancement combined with machine learning has improved greatly in recent years, especially for methods based on deep neural networks. However, because such network models have large storage requirements, high computational complexity, and high power consumption, they are difficult to deploy on mobile devices and embedded systems.

This study therefore aimed to compress a neural network model for speech enhancement, reducing the number of parameters to minimize model redundancy. A deep denoising autoencoder was used to construct the speech enhancement model. Building on magnitude-based (amplitude) pruning, an iterative pruning method combined with retraining was proposed. Comparison with two commonly used methods, namely iterative pruning without retraining and direct (one-shot) pruning followed by retraining, demonstrated the importance of both retraining and gradual (asymptotic) pruning. In addition, the iterative pruning method was optimized by repeating the pruning and retraining cycle, which allowed the sparse network model to reconverge. More importantly, during model compression the word correct rate (WCR) was used as subjective auditory feedback to evaluate the speech enhancement performance of each compressed model. Finally, a curve relating the parameter pruning ratio to WCR was fitted to determine the maximum compression threshold of each compression method.

This dissertation first introduces single-channel speech enhancement methods, especially those based on neural networks, and then introduces neural network model compression methods. Finally, MATLAB code was written to build the neural network based speech enhancement model, and a subjective listening experiment was conducted to compare the performance of each compression method. The iterative pruning and retraining method could remove 50% of the model parameters without significantly affecting speech enhancement performance, and with optimized multiple iterations a maximum pruning ratio of 80% could be achieved, equivalent to a compression ratio of 5:1.
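
The iterative prune-and-retrain idea described in the abstract can be illustrated with a minimal sketch. This is not the thesis code: the thesis used a deep denoising autoencoder in MATLAB, whereas the example below assumes a toy linear least-squares model in plain Python. The function names (`magnitude_mask`, `retrain`, `iterative_prune`, `compression_rate`, `make_toy_data`), the linear sparsity schedule, and all hyperparameters are hypothetical choices made only for illustration.

```python
import random


def make_toy_data(n_samples, true_w, seed=0):
    """Noise-free samples (x, y) with y = true_w . x (toy stand-in for real data)."""
    rng = random.Random(seed)
    data = []
    for _ in range(n_samples):
        x = [rng.uniform(-1.0, 1.0) for _ in true_w]
        y = sum(wi * xi for wi, xi in zip(true_w, x))
        data.append((x, y))
    return data


def magnitude_mask(weights, sparsity):
    """Return a 0/1 mask that removes the `sparsity` fraction of smallest-|w| weights."""
    n_prune = int(round(sparsity * len(weights)))
    # Indices ordered by absolute value, smallest first; already-zeroed
    # weights sort to the front, so earlier pruning decisions are kept.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    mask = [1.0] * len(weights)
    for i in order[:n_prune]:
        mask[i] = 0.0
    return mask


def retrain(weights, mask, data, lr=0.05, epochs=20):
    """Masked stochastic gradient descent on squared error; pruned weights stay zero."""
    w = [wi * mi for wi, mi in zip(weights, mask)]
    for _ in range(epochs):
        for x, y in data:
            err = sum(wi * xi for wi, xi in zip(w, x)) - y
            # Multiplying the update by the mask freezes pruned weights at zero.
            w = [wi - lr * err * xi * mi for wi, xi, mi in zip(w, x, mask)]
    return w


def iterative_prune(weights, data, target_sparsity, rounds=8):
    """Raise sparsity in equal steps (assumed schedule), retraining after each step."""
    dense_mask = [1.0] * len(weights)
    w = retrain(list(weights), dense_mask, data)  # start from a trained dense model
    mask = dense_mask
    for k in range(1, rounds + 1):
        sparsity = target_sparsity * k / rounds
        mask = magnitude_mask(w, sparsity)
        w = retrain(w, mask, data)
    return w, mask


def compression_rate(pruning_ratio):
    """Ratio of original to remaining parameter count, e.g. 0.8 gives about 5 (5:1)."""
    return 1.0 / (1.0 - pruning_ratio)
```

Calling `iterative_prune(initial_weights, data, 0.8)` on a 10-weight model leaves two nonzero weights, and `compression_rate(0.8)` reproduces the 5:1 figure quoted above. Setting `rounds=1` corresponds to direct pruning followed by retraining, and skipping the `retrain` call inside the loop corresponds to iterative pruning without retraining, the two baselines the abstract compares against.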
Keywords: Speech Enhancement, Neural Network, Model Optimization, Pruning Parameters, Subjective Auditory Feedback