Research On Two Methods Of Single Channel Speech Separation

Posted on: 2020-01-18
Degree: Master
Type: Thesis
Country: China
Candidate: X L Dong
GTID: 2428330590954688
Subject: Engineering
Abstract/Summary:
As artificial intelligence technology continues to evolve and spreads into an ever wider range of applications, human-machine voice interaction is becoming increasingly indispensable. The acoustic environment is variable, however, and noise interference often severely degrades the performance of speech interaction, especially under strong noise and single-channel conditions. This hinders the practical deployment of speech technology, so a good front-end speech separation module is particularly important. In recent years, supervised speech separation has made important progress, and the mainstream supervised learning algorithms include computational auditory scene analysis, non-negative matrix factorization, and deep neural network-based speech separation. This thesis mainly studies supervised speech separation algorithms based on non-negative matrix factorization and neural networks. The main contents and innovations are as follows.

First, speech separation based on non-negative matrix factorization is studied in depth and implemented, and the existing models are improved and optimized. A single-channel speech separation algorithm for strong noise, based on convolutive non-negative matrix partial co-factorization, is proposed. A pitch detection algorithm locates the starting point of speech in the mixed signal, which in turn determines the noise-only segment of the mixture. The mixed-signal spectrum and the noise spectrum are then partially co-factorized with the convolutive non-negative matrix model to obtain the speech basis matrix, from which the separated speech spectrum and time-domain signal are recovered. Experimental results show that the proposed method achieves better separation results under different noise types and noise intensities.

Second, the thesis studies supervised speech separation algorithms and network frameworks based on deep clustering, and proposes a speech separation method based on threshold convolution deep clustering. The method makes full use of the strong feature-learning ability of convolutional neural networks with a multi-level nonlinear structure, which are well suited to exploiting the spatio-temporal structure of speech time-frequency units. It models the context of the speech spectrum and takes into account the time-frequency dependence and local characteristics of the speech signal, which helps improve separation performance. Experimental results show that the method not only achieves good separation but also runs significantly faster while maintaining speech performance.

Finally, the thesis summarizes the research and points out directions for future work.
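
The partial co-factorization idea in the first contribution can be illustrated with a minimal sketch. It is not the thesis's algorithm: it uses plain (non-convolutive) NMF with generalized KL multiplicative updates, and it assumes the noise-only segment has already been located (the thesis does this with pitch detection; here a hypothetical `onset` frame index stands in). The function names, the weight `lam`, and the soft-mask reconstruction are illustrative assumptions. The noise bases `W_n` are shared between the mixture and the noise-only segment, while the speech bases `W_s` explain the mixture only.

```python
import numpy as np

EPS = 1e-10  # guards against division by zero


def partial_cofactorize(V_mix, V_noise, k_noise, k_speech,
                        lam=1.0, n_iter=200, seed=0):
    """Jointly factorize V_mix ~ [W_n, W_s] @ H_mix and V_noise ~ W_n @ H_noise.

    Simplified, non-convolutive sketch: generalized KL divergence with
    multiplicative updates; `lam` (assumed) weights the noise-only term.
    """
    rng = np.random.default_rng(seed)
    n_freq, t_mix = V_mix.shape
    t_noise = V_noise.shape[1]
    W_n = rng.random((n_freq, k_noise)) + EPS    # noise bases, shared
    W_s = rng.random((n_freq, k_speech)) + EPS   # speech bases, mixture only
    H_mix = rng.random((k_noise + k_speech, t_mix)) + EPS
    H_noise = rng.random((k_noise, t_noise)) + EPS
    ones_mix = np.ones_like(V_mix, dtype=float)
    ones_noise = np.ones_like(V_noise, dtype=float)

    for _ in range(n_iter):
        W = np.hstack([W_n, W_s])
        # Update activations for the mixture and the noise-only segment.
        H_mix *= (W.T @ (V_mix / (W @ H_mix + EPS))) / (W.T @ ones_mix + EPS)
        H_noise *= (W_n.T @ (V_noise / (W_n @ H_noise + EPS))) / (
            W_n.T @ ones_noise + EPS)
        # Ratios of data to current model, used by the basis updates.
        R_mix = V_mix / (W @ H_mix + EPS)
        R_noise = V_noise / (W_n @ H_noise + EPS)
        H_n, H_s = H_mix[:k_noise], H_mix[k_noise:]
        # Noise bases see both data sets; speech bases see the mixture only.
        W_n_new = W_n * (R_mix @ H_n.T + lam * (R_noise @ H_noise.T)) / (
            ones_mix @ H_n.T + lam * (ones_noise @ H_noise.T) + EPS)
        W_s_new = W_s * (R_mix @ H_s.T) / (ones_mix @ H_s.T + EPS)
        W_n, W_s = W_n_new, W_s_new
    return W_n, W_s, H_mix


def separate_speech(V_mix, W_n, W_s, H_mix):
    """Estimate the speech magnitude spectrum with a soft (Wiener-like) mask."""
    k_noise = W_n.shape[1]
    speech = W_s @ H_mix[k_noise:]
    noise = W_n @ H_mix[:k_noise]
    return V_mix * speech / (speech + noise + EPS)


# Synthetic stand-in for a magnitude spectrogram (frequency bins x frames).
rng = np.random.default_rng(42)
onset = 20  # hypothetical speech onset frame; frames before it are noise only
noise_spec = rng.random((64, 100))
speech_spec = np.zeros((64, 100))
speech_spec[:, onset:] = 2.0 * rng.random((64, 100 - onset))
V_mix = noise_spec + speech_spec
W_n, W_s, H_mix = partial_cofactorize(V_mix, V_mix[:, :onset],
                                      k_noise=4, k_speech=8)
S_hat = separate_speech(V_mix, W_n, W_s, H_mix)
```

In a real pipeline `V_mix` would be the magnitude of a short-time Fourier transform, and the time-domain signal would be recovered by combining `S_hat` with the mixture phase and inverting the transform. Those steps are omitted here.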
Keywords/Search Tags:Convolutive nonnegative matrix partial co-factorization, speech separation, low SNR, monaural speech, deep clustering