Research On Coding Algorithm Based On GMM Speech Spectral Envelope Representation

Posted on: 2018-11-25
Degree: Master
Type: Thesis
Country: China
Candidate: R R Wang
GTID: 2358330518492661
Subject: Circuits and Systems
Abstract/Summary:
Low bit-rate and ultra-low bit-rate speech coding are important directions in the research and development of modern speech coding technology. Reducing the bit rate is the goal and the motivation for the continued development of speech coding. Voice activity detection (VAD) is an important part of speech coding. Building on a study of the VAD algorithm based on the spectrum variance method, this paper proposes a new VAD algorithm that divides the spectrum into bands on the ERB scale. The experimental results show that the ERB-scale VAD algorithm improves the detection accuracy rate, the false detection rate, and the missed detection rate compared with both the basic variance VAD algorithm and the variance VAD algorithm based on the Bark scale.

A new segment coding algorithm is proposed, building on a study of low bit-rate speech coding that uses a Gaussian Mixture Model (GMM) as a parametric representation of the speech spectrum envelope. In the algorithm, the short-time speech spectrum envelope of each frame is first parameterized with a GMM, and several frames are then collected into a segment. A polynomial trajectory is fitted to the GMM parameters within a segment, which reduces the number of parameters to be coded. The results show that the bit rate of the new vocoder is clearly lower than that of the basic GMM-based vocoder.

Additionally, a new vocoder based on an improved Gaussian Mixture Model (iGMM) is proposed to improve low bit-rate speech coding. In this vocoder, the GMM parameters of several frames are grouped into a super-frame and coded jointly across frames, which yields the improved model. The new iGMM is used to fit the speech spectrum envelope. The results show that the iGMM vocoder reduces the bit rate while still producing acceptable decoded speech at 0.86 kb/s.
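The polynomial trajectory fitting described for the segment coding algorithm can be illustrated with a minimal sketch. It is not the thesis implementation: the function names, the normalized time axis, the polynomial order, and the use of ordinary least squares on a Vandermonde matrix are all assumptions made for illustration. Each column of `params` stands for one GMM parameter (for example a component mean) tracked across the frames of a segment.

```python
import numpy as np


def fit_segment_trajectory(params, order):
    """Fit one polynomial per GMM parameter across the frames of a segment.

    params: array of shape (n_frames, n_params), one row per frame.
    order:  polynomial order (an assumed tuning choice, not from the thesis).
    Returns coefficients of shape (order + 1, n_params).
    """
    n_frames = params.shape[0]
    # Normalized frame positions in [0, 1] (assumed time axis).
    t = np.linspace(0.0, 1.0, n_frames)
    # Vandermonde matrix with columns 1, t, t^2, ..., t^order.
    V = np.vander(t, order + 1, increasing=True)
    # Least-squares solution of V @ coeffs ~= params for all parameters at once.
    coeffs, *_ = np.linalg.lstsq(V, params, rcond=None)
    return coeffs


def reconstruct_segment(coeffs, n_frames):
    """Evaluate the fitted polynomials at every frame position of the segment."""
    t = np.linspace(0.0, 1.0, n_frames)
    V = np.vander(t, coeffs.shape[0], increasing=True)
    return V @ coeffs


if __name__ == "__main__":
    # Synthetic segment: 10 frames, 3 hypothetical parameters with smooth trajectories.
    t = np.linspace(0.0, 1.0, 10)
    params = np.stack([1 + 2 * t - t**2, 0.5 * t, 3 * np.ones_like(t)], axis=1)
    coeffs = fit_segment_trajectory(params, order=2)
    print("values before fitting:", params.size)
    print("values after fitting: ", coeffs.size)
    print("max reconstruction error:",
          np.abs(reconstruct_segment(coeffs, 10) - params).max())
```

In this toy case the segment is represented by `(order + 1) * n_params` coefficients instead of `n_frames * n_params` values, which is the kind of parameter reduction the abstract refers to. Real GMM trajectories are not exact polynomials, so the fit would be lossy and the order would trade bit rate against reconstruction error.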
Keywords/Search Tags:Speech coding, Voice Activity Detection, Gaussian Mixture Model, polynomial fitting, Vandermonde matrix