
Speech Compensation Algorithm Based On Deep Learning In VoIP Communication

Posted on: 2021-07-04
Degree: Master
Type: Thesis
Country: China
Candidate: W C Rong
Full Text: PDF
GTID: 2518306197499914
Subject: Electronics and Communications Engineering
Abstract/Summary:
With the rapid development of smart homes, remote man-machine voice interaction over IP networks has become a hot topic in industry and academia, which places higher demands on the quality of transmitted speech. Voice communication over IP networks, i.e., VoIP, often suffers from packet loss and bandwidth limitations, and the resulting loss of voice quality can seriously degrade the performance of voice interaction systems. It is therefore of practical value to study speech compensation algorithms that improve speech quality in VoIP.

As a real-time audio codec, Opus is widely adopted in VoIP communication because of its versatility. At the decoding end, Opus provides speech reconstruction based on an autoregressive model to alleviate packet loss in the IP network. Although this improves voice quality to a certain extent, it performs poorly under continuous frame loss, and its compensation scheme does not provide bandwidth expansion for narrowband speech. In this thesis, two deep learning-based voice compensation algorithms, namely packet loss concealment (PLC) and bandwidth expansion (BWE), are proposed for the Opus codec. The PLC uses a deep learning model to predict the line spectral frequencies (LSF), which, together with other parameters, are then used to reconstruct the lost frames. The BWE uses a generative adversarial network (GAN) to estimate the high-frequency components of narrowband speech.

To evaluate the proposed PLC and BWE algorithms in practical applications, an Opus codec incorporating both, named Opus-NN, is developed for experimental verification. Four widely used objective evaluation metrics are adopted. For PLC, Opus-NN is compared with Opus at five packet loss rates, each under four signal-to-noise ratio (SNR) conditions. For BWE, the proposed GAN-based algorithm is compared with a DNN-based one on noise-free speech under five bandwidth-cutting conditions. Furthermore, to evaluate the impact of the proposed system on speech recognition, noise-free speech at different packet loss rates, and at a 10% packet loss rate combined with different bandwidth cuts, is used to test the compensation performance of Opus and Opus-NN. Results show that the proposed speech compensation algorithms effectively alleviate the speech quality degradation caused by continuous frame loss, with little impact on Opus decoding efficiency, and achieve better performance in speech bandwidth expansion.
Keywords/Search Tags:VoIP, intelligent voice interaction, Opus, deep learning, packet loss compensation, bandwidth expansion
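
To illustrate why continuous frame loss is harder to conceal than isolated loss, the following is a minimal sketch of frame-level loss simulation with a naive concealment baseline. It is not the thesis's LSF-prediction PLC and not Opus's autoregressive reconstruction; the function names `simulate_loss` and `conceal_repeat_last`, the `decay` factor, and the independent per-frame loss model are all assumptions made for illustration.

```python
import random


def simulate_loss(num_frames, loss_rate, seed=0):
    """Return one boolean per frame; True marks the frame as lost.

    Assumption: losses are independent per frame (no burst model).
    """
    rng = random.Random(seed)
    return [rng.random() < loss_rate for _ in range(num_frames)]


def conceal_repeat_last(frames, lost, decay=0.5):
    """Naive baseline: replace a lost frame with an attenuated copy of
    the previous output frame (hypothetical, not the thesis's method)."""
    out = []
    prev = [0.0] * len(frames[0])  # silence before the first frame
    for frame, is_lost in zip(frames, lost):
        cur = [decay * s for s in prev] if is_lost else list(frame)
        out.append(cur)
        prev = cur
    return out


# Five identical two-sample frames; frames 1 to 3 are lost in a row.
frames = [[1.0, -1.0] for _ in range(5)]
lost = [False, True, True, True, False]
concealed = conceal_repeat_last(frames, lost)
for index, frame in enumerate(concealed):
    print(index, frame)
```

With three consecutive losses the concealed signal decays frame by frame toward silence, which is the kind of degradation under continuous loss that a learned predictor of codec parameters aims to reduce.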