Research On End-to-end Speech Enhancement Algorithm Based On Attention Joint Convolutional Network

Posted on:2022-10-12

Degree:Master

Type:Thesis

Country:China

Candidate:X Feng

Full Text:PDF

GTID:2518306560492854

Subject:Software engineering

Abstract/Summary:


The purpose of speech enhancement is to remove interfering noise from noisy speech with an efficient signal processing algorithm and to restore clean speech with high quality and intelligibility. Traditional speech enhancement algorithms rely on strict assumptions about the speech and noise signals, which limits their use in real-world scenes. In recent years, neural networks, which need no such assumptions and have strong data modeling capabilities, have received extensive attention from researchers and have become the mainstream approach in this field.

This thesis focuses on improving the global modeling ability and the speech enhancement capability of convolutional neural networks. Convolution operations are good at attending to local details of the input speech signal, but their receptive field is very limited, so they have difficulty capturing global information. Multiple layers must therefore be stacked to learn the context dependence of the speech signal. As the network deepens, however, it generates a large amount of redundant information, which hinders learning as it is passed from layer to layer. To address these problems, this thesis combines three different types of self-attention mechanisms with convolutional neural networks to help the network obtain global information about the speech signal from multiple angles, focus on effective features, and suppress redundant features. The specific research content is as follows:

(1) The thesis takes the Wave-Unet convolutional neural network as the basic structure, combines the Stand-alone full attention layer with Wave-Unet, and proposes a new speech enhancement model, Wave-sa-Unet. The speech feature map output by the CLP layer is sent to the Stand-alone full attention layer for pixel focusing and feature reconstruction, which helps the model attend to useful information and suppress redundant information. The thesis adopts an end-to-end speech enhancement framework: with a suitably designed network structure, the complex speech feature extraction step is omitted, the noisy speech signal is fed directly to the model for training, and the model outputs the enhanced speech waveform. The thesis also uses the scale-invariant signal-to-noise ratio as the model's objective function, so that training directly optimizes a speech evaluation metric. Experimental results show that, compared with the Wave-Unet baseline model, Wave-sa-Unet produces a scale-invariant signal-to-noise ratio gain of 0.54 dB.

(2) By introducing two further self-attention mechanisms (a non-local module and a channel squeeze-excitation mechanism) into Wave-sa-Unet, the thesis proposes a multi-attention joint convolution speech enhancement model, Wave-ma-Unet. The three self-attention mechanisms assist and calibrate the convolutional network from different angles to help it further improve its denoising and speech enhancement capabilities. Experimental results show that Wave-ma-Unet produces a scale-invariant signal-to-noise ratio gain of 0.66 dB over Wave-sa-Unet.
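
The abstract names the scale-invariant signal-to-noise ratio as the training objective but does not give its formula. The sketch below uses the commonly used definition (zero-mean both waveforms, project the estimate onto the clean target, compare the projected energy with the residual energy) and is an illustration only, not the thesis's implementation. The function names `si_snr` and `si_snr_loss`, the `eps` constant, and the use of plain Python lists instead of tensors are all assumptions made for this example.

```python
import math


def si_snr(estimate, target, eps=1e-8):
    """Scale-invariant SNR in dB between two equal-length waveforms.

    Hypothetical helper for illustration; eps only guards against
    division by zero and log of zero.
    """
    if len(estimate) != len(target):
        raise ValueError("waveforms must have the same length")
    n = len(target)

    # Remove the mean so a constant offset does not affect the score.
    mean_e = sum(estimate) / n
    mean_t = sum(target) / n
    e = [x - mean_e for x in estimate]
    t = [x - mean_t for x in target]

    # Project the estimate onto the target: this is the part of the
    # estimate that is explained by the clean speech.
    dot = sum(a * b for a, b in zip(e, t))
    target_energy = sum(b * b for b in t) + eps
    scale = dot / target_energy
    s_target = [scale * b for b in t]

    # Whatever remains after the projection is treated as noise.
    e_noise = [a - s for a, s in zip(e, s_target)]

    signal_power = sum(s * s for s in s_target)
    noise_power = sum(x * x for x in e_noise)
    return 10.0 * math.log10((signal_power + eps) / (noise_power + eps))


def si_snr_loss(estimate, target):
    """Negative SI-SNR, so that minimizing the loss maximizes SI-SNR."""
    return -si_snr(estimate, target)
```

Because the estimate is projected onto the target before the ratio is taken, rescaling the estimate leaves the score essentially unchanged, which is what "scale-invariant" refers to. In an actual end-to-end model the same arithmetic would be written with differentiable tensor operations so that gradients can flow back to the network.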

Keywords/Search Tags:

Speech enhancement, Convolutional Neural Network, Self-attention mechanism


Related items

1	Research On End-to-end Speech Enhancement Algorithm Based On Attention Joint Convolutional Network
2	Study On Speech Enhancement Based On Deep Learning
3	Research On Enhancement Algorithms Of Low Illumination Images Based On Convolutional Neural Networks
4	Research On Tibetan Speech Enhancement Method Based On Neural Network
5	Research On Speech Enhancement Method Based On Parallel Optimize Recurrent Neural Network
6	Research On Image Caption Based On Attention Mechanism
7	Research On Single Channel Speech Enhancement Based On Multi-head Attention Mechanism
8	Research On Monophonic Speech Enhancement Algorithm Based On Attention Mechanism
9	Research On Speech Enhancement Algorithm Based On Convolutional Neural Network
10	Research On Speech Enhancement Algorithms Based On Deep Learning