Research On Speech Enhancement Under Non-Stationary Noise

Posted on: 2023-07-13    Degree: Master    Type: Thesis
Country: China    Candidate: Z S Chen    Full Text: PDF
GTID: 2558306914963799    Subject: Computer Science and Technology

Abstract/Summary:
With the development of communication technology, people have ever higher requirements for voice quality during communication. Given the objective fact that background noise changes all the time, studying speech enhancement in non-stationary noise scenarios is of great significance for solving practical problems. Because traditional speech enhancement methods rest on the assumption that noise is stationary, their ability to handle real conditions is limited. In recent years, speech enhancement has mainly been implemented with deep learning methods. Existing speech enhancement algorithms have two main problems: (1) the loss function is too idealized and does not account for how the human ear perceives sound; (2) at the low signal-to-noise ratios found in real environments, enhancement quality suffers because time-domain and frequency-domain information is not fused. To address these problems, this thesis proposes two lines of research to improve speech enhancement performance in non-stationary noise scenarios, and designs and implements a real-time conference system with ultra-clear sound quality.

First, speech enhancement based on a perceptual loss function is proposed. The loss functions of existing speech enhancement algorithms are mostly idealized, which leads to poor enhancement results in non-stationary noise scenarios. This thesis takes full account of how the human ear perceives sound and introduces a perceptual loss function that extracts features of the speech signal and compares the enhanced signal with the original clean signal in a more detailed and comprehensive way, so that the algorithm performs well in non-stationary noise scenes.

Second, adaptive speech enhancement based on SNR supervision is proposed. In low signal-to-noise-ratio environments, the enhanced speech is poor because only a single kind of speech information is used. This thesis introduces a frequency-domain loss function to train a time-domain network, fully fusing speech information across the time-frequency domain, analyses the respective strengths and weaknesses of the time-domain and frequency-domain losses, and implements an adaptive loss-weighting strategy based on an attention mechanism, thereby improving the model's generalization ability in complex environments.

Third, a real-time conference system with ultra-clear sound quality is designed and implemented. The speech enhancement algorithms of existing conference systems are limited in capability, resulting in poor speech quality during communication. This thesis designs and implements a client that runs smoothly on Mac and Windows platforms and applies the proposed speech enhancement algorithm to improve voice quality during calls.
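To make the first idea concrete, the following is a minimal PyTorch sketch of one common form of perceptual loss: the enhanced and clean signals are compared in the feature space of a frozen auxiliary network rather than sample by sample. The `feature_net` module and the relative weight are assumptions for illustration; the abstract does not specify the thesis' actual feature extractor or loss composition.

```python
import torch
import torch.nn as nn


class PerceptualLoss(nn.Module):
    """Sketch: combine a sample-level L1 term with a feature-level term
    computed by a frozen auxiliary network (placeholder `feature_net`)."""

    def __init__(self, feature_net: nn.Module, feat_weight: float = 1.0):
        super().__init__()
        self.feature_net = feature_net.eval()
        for p in self.feature_net.parameters():
            p.requires_grad_(False)  # the feature extractor itself is not trained
        self.feat_weight = feat_weight
        self.l1 = nn.L1Loss()

    def forward(self, enhanced: torch.Tensor, clean: torch.Tensor) -> torch.Tensor:
        # Sample-level term keeps the waveform close to the clean target ...
        sample_loss = self.l1(enhanced, clean)
        # ... while the feature-level term penalizes perceptually salient
        # deviations that a plain L1/MSE loss tends to overlook.
        feat_loss = self.l1(self.feature_net(enhanced), self.feature_net(clean))
        return sample_loss + self.feat_weight * feat_loss
```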
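For the second idea, the sketch below shows one plausible way to fuse a time-domain loss with an STFT-magnitude loss and weight the two terms adaptively. Here the weights come from a small softmax gate driven by an estimated input SNR, loosely mirroring the "SNR supervision" and attention-based adaptation described above; the gate, the SNR input, and the STFT settings are all illustrative assumptions, not the thesis' actual design.

```python
import torch
import torch.nn as nn


class TimeFrequencyLoss(nn.Module):
    """Sketch: adaptively weighted sum of a time-domain L1 loss and a
    frequency-domain (STFT magnitude) L1 loss, gated by an SNR estimate."""

    def __init__(self, n_fft: int = 512, hop: int = 128):
        super().__init__()
        self.n_fft, self.hop = n_fft, hop
        self.gate = nn.Linear(1, 2)  # maps an SNR estimate to two loss weights
        self.l1 = nn.L1Loss()

    def stft_mag(self, x: torch.Tensor) -> torch.Tensor:
        window = torch.hann_window(self.n_fft, device=x.device)
        spec = torch.stft(x, self.n_fft, self.hop, window=window, return_complex=True)
        return spec.abs()

    def forward(self, enhanced: torch.Tensor, clean: torch.Tensor,
                snr_db: torch.Tensor) -> torch.Tensor:
        time_loss = self.l1(enhanced, clean)
        freq_loss = self.l1(self.stft_mag(enhanced), self.stft_mag(clean))
        # snr_db: (batch, 1) estimated input SNR; softmax keeps the two weights
        # positive and summing to one, so neither domain is silently discarded.
        w = torch.softmax(self.gate(snr_db), dim=-1).mean(dim=0)
        return w[0] * time_loss + w[1] * freq_loss
```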
Keywords/Search Tags: speech enhancement, perceptual loss function, attention mechanism, deep learning