Research On Monaural Speech Enhancement Based On Quality Assessment And Deep Neural Network

Posted on:2023-07-09

Degree:Master

Type:Thesis

Country:China

Candidate:D Tian

GTID:2558307097489314

Subject:Mechanics (Professional Degree)

Abstract/Summary:

Speech enhancement (SE) technology has developed rapidly over the past decades, moving from traditional unsupervised methods to deep learning methods. As an important front-end system in speech processing, SE is used in speech communication, hearing assistance, speech recognition, video conferencing, and other scenarios. For complex application scenarios, however, improving the Perceptual Evaluation of Speech Quality (PESQ) and Short-Time Objective Intelligibility (STOI) scores of signals as much as possible remains an open goal of current SE research. This thesis therefore studies widely used SE technology in the following three aspects.

(1) Existing SE systems do not screen the received signal and enhance every input by default, so PESQ and STOI can degrade when a speech signal with a high signal-to-noise ratio (SNR) is processed by SE. To address this, the thesis proposes a selective speech enhancement method based on quality assessment. The Non-Intrusive Speech Quality Assessment (NISQA) algorithm, which requires no reference signal, is used to assess the quality of the speech signal before and after enhancement. By comparing the two quality scores, the received signals are screened so that redundant enhancement does not degrade PESQ and STOI.

(2) Using a deep neural network (DNN) with the ideal binary mask (IBM) as the training target, the thesis explores how five of the most widely used features affect the PESQ and STOI of the enhanced signals under different noise types and noise levels: mel-frequency cepstral coefficients (MFCC), gammatone frequency cepstral coefficients (GFCC), relative spectral transformed perceptual linear prediction coefficients (RASTA-PLP), amplitude modulation spectrogram (AMS), and multi-resolution cochleagram (MRCG). The results show that, for a signal under any given background noise, the feature that yields the best PESQ and STOI scores depends on the background noise type, the SNR level, and whether the noise is matched or unmatched. These results are a useful reference for researchers in related fields.

(3) For SE tasks that focus on PESQ, the experiments in (2) show that the feature giving the best PESQ score has a complex relationship with the background noise. Because quality assessment avoids the need to identify the noise type and SNR, the thesis proposes an SE method based on quality assessment and DNN feature selection. Compared with SE systems that use a single extracted feature, the quality-based approach selects the feature that gives the best PESQ score and consistently maintains the desired improvement in PESQ.
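The screening idea in (1) and the feature selection in (3) can both be read as "score each candidate with a reference-free quality measure and keep the best one, including the unprocessed input". The following is a minimal Python sketch of that reading, not the thesis implementation. The names `select_by_quality`, `toy_score`, `smooth`, and `distort` are hypothetical; `toy_score` is only a stand-in for a non-intrusive quality model such as NISQA, and the two toy functions stand in for DNN enhancers trained on different features.

```python
from typing import Callable, List, Sequence, Tuple

Signal = List[float]


def select_by_quality(
    noisy: Signal,
    enhancers: Sequence[Callable[[Signal], Signal]],
    score: Callable[[Signal], float],
) -> Tuple[Signal, int, float]:
    """Keep whichever candidate the reference-free scorer rates highest.

    Returns (signal, index, score). Index -1 means the unprocessed input
    was kept, i.e. enhancement was judged redundant or harmful.
    """
    best_signal, best_index, best_score = noisy, -1, score(noisy)
    for index, enhance in enumerate(enhancers):
        candidate = enhance(noisy)
        candidate_score = score(candidate)
        # Strict comparison: on a tie, the earlier candidate (or the input) wins.
        if candidate_score > best_score:
            best_signal, best_index, best_score = candidate, index, candidate_score
    return best_signal, best_index, best_score


def toy_score(signal: Signal) -> float:
    """Hypothetical placeholder scorer (NOT NISQA): smoother signals score higher."""
    diffs = [abs(b - a) for a, b in zip(signal, signal[1:])]
    return -sum(diffs) / max(len(diffs), 1)


def smooth(signal: Signal) -> Signal:
    """Toy 'enhancer': 3-point moving average with edge padding."""
    padded = [signal[0]] + list(signal) + [signal[-1]]
    return [(padded[i] + padded[i + 1] + padded[i + 2]) / 3 for i in range(len(signal))]


def distort(signal: Signal) -> Signal:
    """Toy 'enhancer' that makes the signal worse under toy_score."""
    return [x + (0.5 if i % 2 else -0.5) for i, x in enumerate(signal)]


if __name__ == "__main__":
    noisy = [0.0, 1.0, 0.0, 1.0, 0.0, 1.0]
    _, chosen, _ = select_by_quality(noisy, [distort, smooth], toy_score)
    print("chosen enhancer index:", chosen)
```

In this sketch an already smooth input is returned unchanged, which mirrors the high-SNR case in (1) where enhancement is skipped. With several enhancers in the list, the same loop mirrors the feature selection described in (3).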

Keywords/Search Tags:

Quality assessment, Deep neural network, Selective enhancement, Monaural speech enhancement, Speech features

Related items

1	Research On Deep Learning Based Monaural Speech Enhancement
2	Research On Speech Enhancement Based On Speech Modeling And Speech Quality Assessment
3	Compression Method Of Deep Neural Network Model For Speech Enhancement
4	Speech Enhancement Based On Deep Neural Network And Recurrent Neural Network
5	Research And Implementation Of Single-channel Speech Enhancement Based On Deep Neural Network
6	Research On Deep Learning Based Speech Enhancement
7	Research On Monaural Speech Enhancement Based On Improved CRN
8	Research On Speech Enhancement Algorithm Based On Deep Neural Network
9	Research On Monaural Speech Enhancement Based On DNN And Phase Spectrum Compensation
10	Research On Monaural Speech Enhancement Based On Prior Information In Different Semantic Levels