Speech-Music Discrimination For Hybrid Coder

Posted on:2019-05-31

Degree:Master

Type:Thesis

Country:China

Candidate:W Z Yang

Full Text:PDF

GTID:2428330545986970

Subject:Pattern Recognition and Intelligent Systems

Abstract/Summary:

A hybrid coder selects its coding mode according to the type of input signal in order to obtain the best coding quality. For such a coder, the accuracy of speech/music discrimination is one of the decisive factors. AMR-WB+ (Extended Adaptive Multi-Rate Wideband codec) and EVS (Codec for Enhanced Voice Services), both from 3GPP, are typical hybrid codecs. AMR-WB+ offers two mode selection methods, closed-loop and open-loop. The closed-loop method classifies more accurately, but at higher complexity; the open-loop method has lower complexity, but its accuracy is unsatisfactory. EVS has no closed-loop method and therefore has lower complexity, but its classification algorithm, based on a Gaussian mixture model, leaves significant room for improvement. To address these problems, this thesis uses a neural network that exploits the temporal correlation between audio frames, and proposes a recurrent neural network (RNN) for speech/music discrimination. The main contributions are as follows.

(1) RNN classifier for AMR-WB+

The RNN classifier takes its features from AMR-WB+ itself, and the training data are labeled with the output of the closed-loop method. The goal is an open-loop classifier whose decisions match those of the closed-loop method. To this end, the thesis designs an RNN-based classifier, addresses the imbalance in the training data, and controls the RNN output so as to maximize coding quality. Experiments show that the proposed method has complexity similar to the open-loop method and accuracy comparable to the closed-loop method. Accuracy improves by about 20%, and subjective quality is also quite close to that of the closed-loop method.

(2) RNN classifier for EVS

EVS cannot label training data with a closed-loop strategy as AMR-WB+ does, so labeling must rely on subjective judgment. The datasets must therefore be sufficiently pure. Speech and music are selected from professional audio datasets to form the training and test data, and features are extracted from the EVS codec for RNN training. Experiments show that accuracy improves for both music and speech signals, especially for music.

These contributions are significant for improving the performance of hybrid codecs.
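
The idea of classifying each frame with a recurrent network while compensating for unbalanced labels can be sketched as follows. This is a minimal illustration, not the thesis's implementation: the abstract does not state the network architecture, the feature set, or how the class imbalance was handled. The single-layer Elman RNN, the inverse-frequency class weighting, the label convention (0 = speech, 1 = music), and every function and parameter name below are assumptions made for the example.

```python
import math
import random
from collections import Counter


def class_weights(labels):
    """Inverse-frequency weights (assumed scheme): rarer classes weigh more."""
    counts = Counter(labels)
    total = len(labels)
    return {c: total / (len(counts) * n) for c, n in counts.items()}


def rnn_forward(frames, W_in, W_rec, w_out, b_h, b_out):
    """Run a single-layer Elman RNN over per-frame feature vectors.

    Returns one P(music) per frame. The hidden state carries context
    from earlier frames into each decision, which is how the temporal
    correlation between frames is exploited.
    """
    hidden = [0.0] * len(b_h)
    probs = []
    for x in frames:
        new_hidden = []
        for j in range(len(b_h)):
            s = b_h[j]
            s += sum(W_in[j][i] * x[i] for i in range(len(x)))
            s += sum(W_rec[j][k] * hidden[k] for k in range(len(hidden)))
            new_hidden.append(math.tanh(s))
        hidden = new_hidden
        logit = b_out + sum(w_out[j] * hidden[j] for j in range(len(hidden)))
        probs.append(1.0 / (1.0 + math.exp(-logit)))
    return probs


def weighted_bce(probs, labels, weights):
    """Class-weighted binary cross-entropy, averaged over frames."""
    eps = 1e-12
    total = 0.0
    for p, y in zip(probs, labels):
        log_lik = math.log(p + eps) if y == 1 else math.log(1.0 - p + eps)
        total -= weights[y] * log_lik
    return total / len(labels)


# Toy demo with random (untrained) weights: 6 frames, 3 features, 4 hidden units.
random.seed(0)
n_feat, n_hid = 3, 4
W_in = [[random.uniform(-0.5, 0.5) for _ in range(n_feat)] for _ in range(n_hid)]
W_rec = [[random.uniform(-0.5, 0.5) for _ in range(n_hid)] for _ in range(n_hid)]
w_out = [random.uniform(-0.5, 0.5) for _ in range(n_hid)]
frames = [[random.random() for _ in range(n_feat)] for _ in range(6)]
labels = [0, 0, 0, 0, 1, 1]  # hypothetical labels: speech frames outnumber music

probs = rnn_forward(frames, W_in, W_rec, w_out, [0.0] * n_hid, 0.0)
loss = weighted_bce(probs, labels, class_weights(labels))
print("per-frame P(music):", [round(p, 3) for p in probs])
print("weighted loss:", round(loss, 4))
```

In the AMR-WB+ setting described above, the labels would come from the closed-loop decisions rather than from hand annotation; training (omitted here) would minimize the weighted loss over the codec's own features.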

Keywords/Search Tags:

Hybrid coder, coding mode selection, signal discrimination, recurrent neural network

Related items

1	An Image Coding Combines Block Prediction Intra Frame Of H.264 Standard And The Specific Hybrid Coder
2	Modeling For Nonlinear System Based On Hybrid Neural Network
3	Question Classification Based On Deep Learning Model
4	Research On Hybrid Network Traffic Prediction Model Based On Mode Decomposition And Neural Networks
5	Research On Emotional Tendency Classification Based On Online Video Website Reviews
6	Mode Selection And Interference Coordination In The Hybrid D2D And Cellular Network
7	Research On Two Kinds Of Aliasing Signals Translation Based On Recurrent Neural Network
8	Recurrent Neural Network Training For Large Signal Model Of TWT
9	Research On Deep Learning Based Antenna Selection And Signal Detection In DM-GSM System
10	Short-term Prediction Study Of Wind Power Based On Hybrid Neural Network And Multiple Signal Decomposition