
Study On Speech Quality Assessment Based On Deep Learning

Posted on: 2022-10-09
Degree: Master
Type: Thesis
Country: China
Candidate: M M Qin
Full Text: PDF
GTID: 2518306509477484
Subject: Information and Communication Engineering
Abstract/Summary:
Speech quality assessment is an important research topic in the speech processing community, with a wide range of applications in mobile communication, the Internet, consumer electronics, digital entertainment, public safety, and other fields. Subjective speech quality assessment consumes considerable human and material resources and is time-consuming, so objective assessment methods are becoming increasingly popular. Intrusive methods require the clean original speech as a reference, which may be difficult to obtain in practice. Objective non-intrusive speech quality assessment has therefore gradually gained attention, and in recent years non-intrusive methods based on deep learning have made important progress. However, deep-learning-based methods have a large number of parameters, and their evaluation accuracy still needs to be improved. To address these problems, this thesis studies objective non-intrusive speech quality assessment based on deep learning. The main work is as follows:

(1) A speech quality assessment method based on the attention mechanism and a convolutional recurrent network is proposed. A convolutional neural network (CNN) and a bidirectional long short-term memory (BiLSTM) network are combined to form the CBLSTM network, which exploits the ability of the CNN to capture spatial information within local receptive fields and the ability of the BiLSTM to memorize the contextual information of a sequence. On this basis, the Squeeze-and-Excitation (SE) module is added to the CBLSTM network. The SE module recalibrates the feature map by learning the correlation between its channels and thereby the relative importance of each channel. In addition, a custom loss function based on the signal-to-distortion ratio (SDR) is proposed for model fitting, which improves the evaluation performance of the model. The experimental results verify the validity of the proposed method.

(2) A speech quality assessment method is proposed that combines the efficient channel attention (ECA) module with an improved convolutional neural network and a bidirectional gated recurrent unit (BiGRU). First, to reduce the number of parameters and the amount of computation, BiGRU and depthwise separable convolution are combined. Then, the main structure of the residual network (ResNet) is used to optimize the convolutional part: it maps shallow feature information directly to the deep layers to improve speech quality assessment performance. On this basis, the SE module and the ECA module are added to the model to filter the input information effectively and further improve performance. The experimental results show that this method achieves good evaluation performance with a small number of parameters.
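
The parameter savings that motivate the second method can be illustrated with a short sketch. It counts the trainable parameters of a standard convolution against a depthwise separable one, of a BiLSTM against a BiGRU, and of an SE block against an ECA block, using the textbook formulas. All layer sizes (64 and 128 channels, 3x3 kernels, hidden size 128, SE reduction ratio 16, ECA kernel size 3) and all function names are illustrative assumptions and are not taken from the thesis. The recurrent count assumes one bias vector per gate; some frameworks store two, which changes the totals slightly.

```python
def conv2d_params(c_in, c_out, k, bias=True):
    """Parameters of a standard k x k convolution."""
    return k * k * c_in * c_out + (c_out if bias else 0)


def depthwise_separable_params(c_in, c_out, k, bias=True):
    """Parameters of a k x k depthwise conv followed by a 1 x 1 pointwise conv."""
    depthwise = k * k * c_in + (c_in if bias else 0)   # one k x k filter per input channel
    pointwise = c_in * c_out + (c_out if bias else 0)  # 1 x 1 conv mixes the channels
    return depthwise + pointwise


def recurrent_params(input_size, hidden_size, gates, bidirectional=True):
    """Parameters of a gated recurrent layer (gates=4 for LSTM, gates=3 for GRU).

    Assumes a single bias vector per gate (textbook formulation).
    """
    per_gate = hidden_size * (input_size + hidden_size) + hidden_size
    per_direction = gates * per_gate
    return per_direction * (2 if bidirectional else 1)


def se_params(channels, reduction=16):
    """Parameters of an SE block: two fully connected layers with biases."""
    hidden = channels // reduction
    return (channels * hidden + hidden) + (hidden * channels + channels)


def eca_params(kernel_size=3):
    """Parameters of an ECA block: one 1-D convolution across channels, no bias."""
    return kernel_size


if __name__ == "__main__":
    # Hypothetical layer sizes, chosen only to make the comparison concrete.
    print("standard conv       :", conv2d_params(64, 128, 3))
    print("depthwise separable :", depthwise_separable_params(64, 128, 3))
    print("BiLSTM              :", recurrent_params(128, 128, gates=4))
    print("BiGRU               :", recurrent_params(128, 128, gates=3))
    print("SE block            :", se_params(128, reduction=16))
    print("ECA block           :", eca_params(3))
```

Under these assumed sizes the depthwise separable convolution needs roughly one eighth of the parameters of the standard convolution, and the BiGRU needs three quarters of the parameters of the BiLSTM, since it has three gates instead of four.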
Keywords/Search Tags: Speech Quality Assessment, Attention Mechanism, Deep Learning, Depthwise Separable Convolution, Bidirectional Gated Recurrent Unit