
Research And FPGA Implementation Of Speech Enhancement Algorithm Based On CNN

Posted on: 2022-09-09
Degree: Master
Type: Thesis
Country: China
Candidate: L Wei
Full Text: PDF
GTID: 2518306524984489
Subject: Master of Engineering
Abstract/Summary:
As part of a speech processing system, speech enhancement serves as a front end that improves the quality of the source data fed to the post-processing stages, which in turn improves the performance of the whole system. As an independent system, it also improves the perceived quality of degraded speech and the overall listening experience. These characteristics have led to its wide use in fields such as mobile communication, military communication, and human-computer interaction. With the development of IoT-related technologies, the scenarios in which data processing platforms must operate are becoming more complex and diverse. Embedded platforms, which are deployed at the terminal and consume little power, are well suited to the current trend toward edge computing. For non-stationary noise, traditional speech enhancement algorithms can no longer match the performance of algorithms based on deep learning. However, deep learning algorithms are difficult to implement on an embedded platform because of their complex structure and large number of trainable parameters. The main work of this thesis is as follows:

1. A convolutional neural network based on an encoder-decoder architecture, which has shown outstanding ability in image denoising tasks, is applied to speech enhancement. By comparing and analyzing the structures of two typical speech enhancement network models and their enhancement performance, a model suitable for an embedded platform is selected and then retrained on datasets containing more speakers and noise types.
2. Based on an analysis of parallelism, a "parallel data path + control logic + on-chip cache" structure, which can temporarily store structured data, is designed for convolution acceleration. In addition, the batch normalization (BN) coefficients are merged into the weights, so that the BN operation no longer needs to be computed separately.
3. The speech enhancement system is designed end to end, from speech acquisition through speech feature extraction to speech enhancement, and is implemented on a Xilinx Zynq 7020 SoC. The quantized inference results of the system are analyzed using several speech quality evaluation methods, and the quantization error is calculated and evaluated by comparison with the results of the floating-point algorithm.

This thesis focuses on optimizing the implementation of the speech enhancement algorithm on an embedded platform. During the design, the resources and architecture of the chip are considered comprehensively, and accuracy is carefully weighed against system delay. In the end, the convolutional network part takes only 0.0016 s (at a 100 MHz clock) to obtain the enhanced spectrum of a single frame (129 data points), and the total power of the system on chip is only 2.129 W.
Keywords/Search Tags:Speech enhancement, Zynq SoC, Convolutional Neural Network, Parallelization acceleration
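
The abstract says that the BN coefficients are combined with the weights so that BN need not run as a separate step. The abstract does not give the thesis's exact formulation, so the following is only a minimal sketch of the standard folding identity, assuming a single-input-channel 1-D convolution with a per-channel bias and inference-time BN statistics. All function names and numeric values are hypothetical and chosen for illustration.

```python
import math

def conv1d(x, kernels, biases):
    """Valid 1-D cross-correlation, one input channel, one kernel per output channel."""
    out = []
    for kernel, b in zip(kernels, biases):
        k = len(kernel)
        out.append([sum(kernel[j] * x[i + j] for j in range(k)) + b
                    for i in range(len(x) - k + 1)])
    return out

def batch_norm(channels, gamma, beta, mean, var, eps=1e-5):
    """Inference-time BN applied per output channel with fixed statistics."""
    return [[g * (v - m) / math.sqrt(s + eps) + bt for v in ch]
            for ch, g, bt, m, s in zip(channels, gamma, beta, mean, var)]

def fold_bn(kernels, biases, gamma, beta, mean, var, eps=1e-5):
    """Merge BN into the convolution: w' = w * scale, b' = (b - mean) * scale + beta."""
    folded_k, folded_b = [], []
    for kernel, b, g, bt, m, s in zip(kernels, biases, gamma, beta, mean, var):
        scale = g / math.sqrt(s + eps)
        folded_k.append([w * scale for w in kernel])
        folded_b.append((b - m) * scale + bt)
    return folded_k, folded_b

# Hypothetical example values (not taken from the thesis).
x = [0.5, -1.0, 2.0, 0.25, -0.75, 1.5]
kernels = [[0.2, -0.1, 0.4], [1.0, 0.5, -0.3]]
biases = [0.1, -0.2]
gamma, beta = [1.5, 0.8], [0.05, -0.1]
mean, var = [0.3, -0.4], [0.9, 1.6]

# Reference path: convolution followed by a separate BN step.
reference = batch_norm(conv1d(x, kernels, biases), gamma, beta, mean, var)

# Fused path: a single convolution with folded weights and biases.
folded_k, folded_b = fold_bn(kernels, biases, gamma, beta, mean, var)
fused = conv1d(x, folded_k, folded_b)

max_diff = max(abs(r - f)
               for rc, fc in zip(reference, fused)
               for r, f in zip(rc, fc))
print(f"max difference between the two paths: {max_diff:.2e}")
```

Because the folding is done once offline, the hardware only ever sees an ordinary convolution with adjusted weights and biases, which is one plausible reason such a merge suits a resource-limited FPGA design.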