Design And Implementation Of CNN-BLSTM Speech Separation Algorithm Fused With Self-attention Mechanism

Posted on:2022-07-31

Degree:Master

Type:Thesis

Country:China

Candidate:H X Zhu

Full Text:PDF

GTID:2518306737978939

Subject:Computer technology

Abstract/Summary:

With the development of computer technology, the demand for artificial intelligence is growing steadily, and applying deep learning to time-series feature recognition is a current research focus. In intelligent communication, voice assistants and similar applications, speech separation provides key technical support for obtaining information accurately. Separating speech from noise and separating one speaker's speech from another's are the two main research directions in speech separation, and both are fundamental tasks in speech signal processing. This thesis designs and implements speech separation using a CNN-BLSTM model fused with a self-attention mechanism (SACNN-BLSTM). The main work covers the following aspects:

1) Obtaining time-series data that meet the experimental requirements. A suitable data set is first selected. Because the selected data cannot be used directly, they are preprocessed, and clean speech is mixed either with other clean speech or with noise to produce the mixed speech needed in the experiments.

2) Separating speech from noise. The noisy speech signal is separated to recover the clean target speech. To make up for the shortcomings of the CNN-BLSTM model, the self-attention mechanism is integrated into it, so that the time-frequency features dominated by the target speech receive more attention and the target speech is more clearly distinguished from the noise, which denoises the target speech signal. The experimental results show that, compared with the CNN-BLSTM model, the SACNN-BLSTM model improves the signal-to-distortion ratio (SDR) of the separated target speech to a certain extent.

3) Separating mixed speech. The goal is to separate the mixed speech signal of two speakers into an independent clean speech signal for each speaker. The SACNN-BLSTM model takes speech signal features as input and applies the self-attention mechanism to the high-dimensional abstract features produced by the Dilated CNN layer and the BLSTM layer. A weight is assigned to the time-frequency features of each frame, so that the time-frequency characteristics of the two speakers' signals are clearly distinguished and the mixed speech signal can be separated. The experimental results show that, compared with the CNN-BLSTM model, the SACNN-BLSTM model improves overall separation performance without losing short-term intelligibility.

Based on the above research, the design and implementation of a CNN-BLSTM speech separation algorithm fused with a self-attention mechanism is completed. The algorithm can separate speech from environmental noise and can also separate mixed speech, improving the quality of the speaker's speech. Tests show that the algorithm achieves the expected design goals.
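
The data preparation and evaluation steps mentioned in the abstract (mixing clean speech with noise, then scoring the result by SDR) can be illustrated with a minimal sketch. This is not the thesis's pipeline: the function names `mix_at_snr` and `sdr_db`, the 5 dB mixing level, the 8 kHz sample rate and the synthetic sine and Gaussian signals are all assumptions chosen for illustration, and the SDR here is the simplified energy-ratio form rather than the full BSS Eval decomposition that the thesis may have used.

```python
import math
import random


def power(signal):
    """Mean squared amplitude of a signal given as a list of floats."""
    return sum(x * x for x in signal) / len(signal)


def mix_at_snr(speech, noise, snr_db):
    """Scale `noise` so the speech-to-noise power ratio is `snr_db`, then add it."""
    if len(speech) != len(noise):
        raise ValueError("speech and noise must have the same length")
    # Required noise power: P_speech / 10^(snr_db / 10).
    target_noise_power = power(speech) / (10.0 ** (snr_db / 10.0))
    gain = math.sqrt(target_noise_power / power(noise))
    return [s + gain * n for s, n in zip(speech, noise)]


def sdr_db(reference, estimate):
    """Simplified SDR in dB: 10 * log10(||ref||^2 / ||ref - est||^2).

    Assumes the estimate is not identical to the reference
    (otherwise the error energy is zero).
    """
    signal_energy = sum(r * r for r in reference)
    error_energy = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    return 10.0 * math.log10(signal_energy / error_energy)


# Synthetic stand-ins for real recordings (assumption: 1 s at 8 kHz).
rng = random.Random(0)
sample_rate = 8000
speech = [math.sin(2 * math.pi * 220 * t / sample_rate) for t in range(sample_rate)]
noise = [rng.gauss(0.0, 1.0) for _ in range(sample_rate)]

# Build a noisy mixture and score it against the clean reference.
mixture = mix_at_snr(speech, noise, snr_db=5.0)
baseline_sdr = sdr_db(speech, mixture)
```

Under this simplified definition, the SDR of the unprocessed mixture equals the mixing SNR, so `baseline_sdr` serves as the reference point that a separation model's output would need to exceed.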

Keywords/Search Tags:

Self-attention mechanism, Speech separation, Speech signal, Dilated CNN, BLSTM

Related items

1	Research On Monaural Speech Separation Of Specific Speaker Based On Deep Learning
2	Multi-speaker Speech Separation Based On Deep Learning
3	Research And Design Of Speech Separation Algorithm Based On Deep Learning
4	Research And Application Of Speech Signal Enhancement And Separation In Smart Home
5	Research And Implementation Of Speech Separation Technology
6	With Noise-aliasing Blind Speech Signal Separation Method
7	The Study Of Dual-channel Speech Separation Technology For Smart Mobile Devices
8	Research On Speech Signal Recognition Based On Deep Two-way GRU And Attention Mechanism
9	Research On Multi-person Speech Recognition Based On Deep Learning
10	Speech Emotion Recognition Based On Deep Learning Technology