
Research On Speech Separation Algorithm Based On Traditional Method And Deep Learning Method

Posted on: 2022-09-07
Degree: Master
Type: Thesis
Country: China
Candidate: Y Liu
Full Text: PDF
GTID: 2518306314468564
Subject: Signal and Information Processing
Abstract/Summary:
Speech signals are the most commonly used communication signals in daily life and carry a large amount of linguistic and emotional information. With the development of technology, speech processing has been widely applied in fields such as intelligent control, biomedicine and electronic finance. Speech separation is the basis of speech recognition and speech enhancement, so detecting and accurately separating a specific speech signal from the observed signal in a complex environment has important research value. There are two main approaches to speech separation. One is the non-deep-learning approach based on signal processing, also called the traditional approach; the other is separation based on deep learning.

For the problem of underdetermined blind source separation in traditional algorithms, this thesis proposes a GA-FastICA algorithm. The observed signal is first denoised by the GA algorithm, and the FastICA algorithm then completes the separation of the speech signals. Experimental results show that, at lower signal-to-noise ratios and under different types of noise, the proposed algorithm separates better than the original algorithm.

For the problem of single-channel blind source separation in traditional algorithms, this thesis proposes a hyperplane decomposition method based on NMF, in which the original mixed matrix is expressed by a basis matrix and a coefficient matrix. Each column of the mixed matrix can be computed from the basis matrix and the coefficient matrix; geometrically, this corresponds to projecting the sample set onto the subspace spanned by the basis vectors. The experiments evaluate separation with 8 and 16 hyperplanes and explore the relationship between reconstruction quality and the number of hyperplanes, which provides a new idea for single-channel speech separation.

For deep-learning speech separation, this thesis proposes an LSTM network model that is trained on the input speech signal in combination with the ideal binary mask (IBM). The model addresses the vanishing-gradient problem of RNNs and separates the singing voice signal from the background music signal.

For the separation of human speakers, this thesis combines the beamforming algorithm with the LSTM network and proposes a beamforming LSTM algorithm. A superdirective beamforming algorithm is used to obtain beams in three different directions, and the spectral magnitude features of each beam are extracted. A neural network is then constructed to predict the mask values, from which the spectrum of the target speech signal is obtained and the time-domain signal is reconstructed, completing the separation. The algorithm makes full use of both the spatial characteristics and the frequency-domain characteristics of the speech signal. The separation results are evaluated with PESQ, STOI and SDR. The results show that the proposed algorithm improves on the LSTM algorithm in all of these metrics and separates speakers better.
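
As an illustration of the NMF step described in the abstract, where the mixed matrix is expressed by a basis matrix and a coefficient matrix, the following is a minimal sketch of plain NMF with multiplicative updates. It is not the thesis's hyperplane decomposition method; the function name `nmf`, the toy matrix `V`, the rank and the iteration count are all assumptions chosen for the example.

```python
import numpy as np

def nmf(V, rank, n_iter=200, eps=1e-9, seed=0):
    """Factor a nonnegative matrix V (F x T) as W (F x rank) @ H (rank x T)
    using multiplicative updates that reduce the Frobenius error."""
    rng = np.random.default_rng(seed)
    n_rows, n_cols = V.shape
    W = rng.random((n_rows, rank)) + eps   # basis matrix (assumed random init)
    H = rng.random((rank, n_cols)) + eps   # coefficient matrix
    for _ in range(n_iter):
        # Update coefficients, then basis; eps guards against division by zero.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Hypothetical stand-in for a magnitude spectrogram: nonnegative, rank 2.
rng = np.random.default_rng(1)
V = rng.random((6, 2)) @ rng.random((2, 10))

W, H = nmf(V, rank=2)
rel_err = np.linalg.norm(V - W @ H) / np.linalg.norm(V)

# Column j of V is approximated by the basis matrix times column j of H,
# i.e. a nonnegative combination of the basis vectors.
col_j = W @ H[:, 3]
print("relative reconstruction error:", rel_err)
```

In this sketch the columns of `W` play the role of the basis vectors and each column of `H` holds the weights that rebuild the corresponding column of `V`, which matches the column-wise reading given in the abstract.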
Keywords/Search Tags: Speech separation, GA-FastICA, Hyperplane decomposition, Beamforming algorithm, Long short-term memory network model