Design And Implementation Of Single-channel Realtime Speech Enhancement System Based On Deep Learning

Posted on: 2021-02-25
Degree: Master
Type: Thesis
Country: China
Candidate: C L Xu
Full Text: PDF
GTID: 2518306569497894
Subject: IC Engineering
Abstract/Summary:
Speech enhancement is an important part of speech processing and its applications, and it is becoming increasingly important not only in traditional fields such as hearing aids and voice communication but also in the emerging field of voice control. Microphone-array speech enhancement usually relies on a large microphone array and has relatively high computational cost and complexity, so it is currently used mainly in specific scenarios. In most cases, the pickup device should occupy as little space as possible, and the computational cost and complexity of the algorithm should satisfy the real-time requirements of speech enhancement. Single-channel speech enhancement is better suited to these requirements. However, current single-channel speech enhancement systems either suppress noise insufficiently or are too complex to be deployed on mobile devices. To solve these problems, this thesis designs a single-channel real-time speech enhancement system that combines deep learning with a traditional speech enhancement algorithm, and implements the entire system on an embedded hardware platform.

At the algorithm level, traditional speech enhancement algorithms estimate the a priori SNR inaccurately, while purely deep-learning-based models have too many parameters to run in real time. To address both problems, this thesis builds on the Deep Xi-TCN a priori SNR estimation framework, selects an appropriate traditional spectrum estimator, and constructs a speech enhancement model in which the two components complement each other. To improve the generalization ability of the system, a large Chinese speech dataset and a large background-noise dataset are used for model training, and a comparative experiment is designed to evaluate the enhancement performance of the model. The experimental results show that the model suppresses noise well even under complex noise conditions.

To address the key problem of converting the non-real-time speech enhancement algorithm into a real-time system, this thesis proposes a real-time single-channel speech enhancement system model based on temporal convolutional networks. Buffers are placed in the system to store the past speech information that each temporal convolutional network needs to read and write, which changes the operating mode of the system from non-real-time to real-time. A buffer read-write strategy is designed to improve data access efficiency: instead of operating the buffers in first-in-first-out fashion, the system updates the read and write addresses, turning frequent data movement into a cyclic index increment. In addition, a time-frequency conversion module and a spectrum estimator module are designed, which effectively improve system performance. With these components, the construction of the real-time single-channel speech enhancement system is complete. The experimental results show that the real-time system is consistent with the original algorithm model trained in TensorFlow.

To implement the single-channel speech enhancement system and verify its practical performance, this thesis builds the real-time system on the iTop-4412 embedded hardware platform, combining the open-source tinyalsa application programming interface with the real-time single-channel speech enhancement system. The initial implementation ran too slowly for normal use. The system was therefore optimized and accelerated according to the features of the platform, and the acceleration achieved for each module was verified. The experimental results show that the speech enhancement system can record, enhance, and play back audio in real time on the iTop-4412 embedded hardware platform. In addition, the enhancement performance of the system implemented on the hardware platform was tested, and the results show that the enhancement results are consistent before and after the hardware implementation and that the system adapts well to practical application scenarios.
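The buffer read-write strategy summarized in the abstract can be illustrated with a minimal Python sketch. It is not the thesis implementation: the class names `RingBuffer` and `StreamingDilatedConv`, the scalar (single-feature) frames, and the example weights are all assumptions made for illustration. The sketch shows a history buffer whose write index advances cyclically, so no stored frame is ever moved, and a frame-by-frame dilated causal convolution that reads its taps by index arithmetic. An offline reference function is included so the streaming output can be compared with the non-real-time result.

```python
class RingBuffer:
    """Fixed-size history buffer that advances a write index instead of shifting data."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.data = [0.0] * capacity  # zero history acts as causal zero padding
        self.write_pos = 0            # slot that the next frame will overwrite

    def push(self, value):
        # Overwrite the oldest slot, then advance the index cyclically.
        self.data[self.write_pos] = value
        self.write_pos = (self.write_pos + 1) % self.capacity

    def read_back(self, steps):
        """Return the value written `steps` pushes ago (0 = most recent)."""
        if not 0 <= steps < self.capacity:
            raise IndexError("steps must lie within the buffer capacity")
        return self.data[(self.write_pos - 1 - steps) % self.capacity]


class StreamingDilatedConv:
    """Dilated causal 1-D convolution evaluated one frame at a time (hypothetical helper)."""

    def __init__(self, weights, dilation):
        self.weights = list(weights)  # weights[0] multiplies the current frame
        self.dilation = dilation
        # The receptive field of the layer determines how much history is kept.
        self.history = RingBuffer((len(self.weights) - 1) * dilation + 1)

    def step(self, x):
        self.history.push(x)
        return sum(
            w * self.history.read_back(i * self.dilation)
            for i, w in enumerate(self.weights)
        )


def offline_dilated_conv(signal, weights, dilation):
    """Reference: the same convolution computed over a whole utterance at once."""
    out = []
    for t in range(len(signal)):
        acc = 0.0
        for i, w in enumerate(weights):
            j = t - i * dilation
            if j >= 0:  # samples before the start of the signal count as zero
                acc += w * signal[j]
        out.append(acc)
    return out
```

Feeding a signal through `StreamingDilatedConv.step` one frame at a time and comparing the result with `offline_dilated_conv` mirrors, on a toy scale, the consistency check between the real-time system and the offline model. In a multi-layer network each layer would hold its own buffer sized to its own dilation.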
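The pairing of an a priori SNR estimate with a traditional spectrum estimator can also be sketched briefly. The abstract does not say which estimator the thesis selects, so the Wiener gain below is only one common choice, used here as an assumption. The function names and the per-bin list representation are hypothetical as well. Given an estimated a priori SNR for each frequency bin (on a linear scale), the sketch computes a gain between 0 and 1 and applies it to the noisy magnitude spectrum of one frame.

```python
def db_to_linear(snr_db):
    """Convert an SNR from decibels to a linear power ratio."""
    return 10.0 ** (snr_db / 10.0)


def wiener_gain(xi):
    """Wiener gain for a linear-scale a priori SNR xi (assumed estimator)."""
    if xi < 0:
        raise ValueError("a priori SNR must be non-negative on a linear scale")
    return xi / (1.0 + xi)


def enhance_frame(noisy_magnitudes, xi_hat):
    """Scale each frequency bin of one frame by the gain derived from its SNR estimate."""
    if len(noisy_magnitudes) != len(xi_hat):
        raise ValueError("one a priori SNR estimate is needed per frequency bin")
    return [wiener_gain(xi) * mag for mag, xi in zip(noisy_magnitudes, xi_hat)]
```

Bins with a high estimated SNR pass through almost unchanged, while bins dominated by noise are attenuated toward zero. In a complete system the enhanced magnitudes would be recombined with the noisy phase before the inverse time-frequency transform.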
Keywords/Search Tags: speech enhancement, temporal convolutional network, a priori SNR, deep learning, embedded