Study On Speech Quality Assessment Based On Deep Learning

Posted on: 2022-10-09

Degree: Master

Type: Thesis

Country: China

Candidate: M M Qin

Full Text: PDF

GTID: 2518306509477484

Subject: Information and Communication Engineering

Abstract/Summary:

Speech quality assessment is an important research topic in the speech processing community, with wide applications in mobile communication, the Internet, consumer electronics, digital entertainment, public safety, and other fields. Subjective speech quality assessment consumes substantial human and material resources and is time-consuming, so objective speech quality assessment methods are becoming increasingly popular. Intrusive methods require the clean original speech as a reference, which may be difficult to obtain in practice. Objective non-intrusive speech quality assessment has therefore gradually gained attention, and in recent years research on non-intrusive speech quality assessment based on deep learning has made important progress. However, existing deep-learning-based methods have a large number of parameters, and their evaluation accuracy needs to be improved. To address these problems, this thesis studies objective non-intrusive speech quality assessment based on deep learning. The main contributions are as follows:

(1) A speech quality assessment method based on the attention mechanism and a convolutional recurrent network is proposed. A convolutional neural network (CNN) and a bidirectional long short-term memory (BiLSTM) network are combined to form the CBLSTM network, which exploits the ability of the CNN to capture spatial information within local receptive fields and the ability of the BiLSTM to memorize the contextual information of the sequence. On this basis, a Squeeze-and-Excitation (SE) module is added to the CBLSTM network; it recalibrates the feature map by learning the correlations between its channels to obtain the importance of each channel. In addition, a custom loss function based on the signal-to-distortion ratio (SDR) is proposed for model fitting, which improves the evaluation performance of the model. The experimental results verify the validity of the proposed method.

(2) By incorporating the efficient channel attention (ECA) module, a speech quality assessment method based on an improved convolutional neural network and a bidirectional gated recurrent unit (BiGRU) is proposed. First, to reduce the number of parameters and computations, BiGRU and depthwise separable convolution are combined. Then, the main structure of the residual network (ResNet) is used to optimize the convolutional part, which directly maps shallow feature information to the deep layers to improve assessment performance. On this basis, the SE module and the ECA module are added to the model to filter the input information effectively and further improve performance. The experimental results show that this method achieves good evaluation performance with a small number of parameters.
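
The parameter savings mentioned in the abstract can be illustrated with a small counting sketch. It is not taken from the thesis: the channel sizes, kernel size, and SE reduction ratio below are assumed values chosen only for illustration, and the function names are hypothetical. It compares a standard convolution with a depthwise separable one, and an SE module with an ECA module, using the usual weight-count formulas (biases ignored).

```python
# Illustrative weight counts only; all layer sizes are assumptions,
# not values reported in the thesis.

def standard_conv_params(c_in, c_out, k):
    """Weights in a standard k x k 2-D convolution (bias ignored)."""
    return c_in * c_out * k * k


def depthwise_separable_params(c_in, c_out, k):
    """Depthwise k x k conv (one filter per input channel)
    followed by a 1 x 1 pointwise conv (bias ignored)."""
    depthwise = c_in * k * k
    pointwise = c_in * c_out
    return depthwise + pointwise


def se_params(channels, reduction=16):
    """SE module: two fully connected layers C -> C/r -> C (bias ignored)."""
    hidden = channels // reduction
    return channels * hidden + hidden * channels


def eca_params(kernel_size=3):
    """ECA module: one 1-D convolution over the channel descriptor,
    so the weight count is just the kernel size (bias ignored)."""
    return kernel_size


# Assumed example layer: 64 input channels, 128 output channels, 3 x 3 kernel.
c_in, c_out, k = 64, 128, 3
std = standard_conv_params(c_in, c_out, k)
dws = depthwise_separable_params(c_in, c_out, k)
se = se_params(c_out, reduction=16)
eca = eca_params(kernel_size=3)

print("standard conv weights:           ", std)
print("depthwise separable conv weights:", dws)
print("SE module weights:               ", se)
print("ECA module weights:              ", eca)
```

Under these assumed sizes, the depthwise separable layer needs far fewer weights than the standard convolution, and the ECA module is much lighter than the SE module. This is consistent with the abstract's aim of a small parameter count, although the actual figures depend on the layer configuration used in the thesis.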

Keywords/Search Tags:

Speech Quality Assessment, Attention Mechanism, Deep Learning, Depthwise Separable Convolution, Bidirectional Gated Recurrent Unit

Related items

1	Speech Emotion Recognition Based On Deep Learning
2	Research On Sound Source Localization Based On SELDnet
3	Research On Communication Signal Modulation Recognition Based On Deep Learning
4	Application Of Improved Deep Learning Algorithm In Chinese Text Classification
5	Research On The Grab Detection Algorithm Of Portal Crane Based On Deep Learning
6	Research On Visual Action Recognition Based On Deep Learning
7	Research On Hierarchical Text Emotional Classification Based On Deep Learning
8	Study On Chinese Speech Synthesis Methods Based On Deep Learning
9	Research On End-to-End Speech Recognition Based On GRU And Self-Attention Mechanism
10	Database Construction And Algorithm Research Of Visual Speech Recognition Based On Deep Learning