
Research On 8 Kbps Speech Coding Algorithm With Delay Of 2.5 ms

Posted on: 2013-02-19    Degree: Doctor    Type: Dissertation
Country: China    Candidate: S H Wu
GTID: 1118330371990765    Subject: Circuits and Systems
Abstract/Summary:
The ITU has set the target of a toll-quality speech coding algorithm with a delay of less than 5 ms and a rate of less than 8 Kbps. In response to this target, we focus on the LD-CELP speech coding algorithm and design a toll-quality speech coding algorithm with a coding rate of 8 Kbps and a delay of 2.5 ms. To this end, we study the following four issues and make effective progress on each.

First, the frame size, which is 5 samples in G.728, is extended to 20 samples in the low-delay algorithm. We introduce an adaptive codebook composed of the most recent past excitation. The algorithm uses a double-codebook structure containing an adaptive codebook and a fixed codebook. To preserve coding quality while reducing computational complexity, we configure a new algorithm structure that differs from existing algorithms. We propose and implement three algorithms with a coding rate of 8 Kbps and a delay of 2.5 ms.

Algorithm 1: The fixed codebook and the adaptive codebook are both 10 bits (3 bits for the gain vector and 7 bits for the waveform codeword). The algorithm uses a full codebook search.

Algorithm 2: The algorithm uses a discontinuous frame search. The adaptive codebook search is performed only in even frames; each following odd frame directly reuses the search result of the preceding even frame. The saved bits are used to enlarge the fixed codebook.

Algorithm 3: The algorithm integrates backward pitch detection into the adaptive codebook search. It first detects the backward pitch to obtain the pitch period T, and then performs a fine search of the adaptive codebook within a certain range around T.

Experimental results show that the coding quality of all three methods is very close to that of G.728, with a delay of 2.5 ms and a coding rate of 8 Kbps.
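The figures and search strategies described above can be illustrated with a minimal sketch. It is not the dissertation's implementation: it assumes the 8 kHz sampling rate of G.728 (not stated in this abstract), matches candidates to the target directly without the synthesis and perceptual weighting filters of a real CELP coder, and takes the backward pitch estimate as a given input. `MIN_LAG`, `half_width` and all function names are hypothetical.

```python
# Simplified adaptive codebook search (a sketch, not the thesis implementation).
# Assumptions: 8 kHz sampling as in G.728; no synthesis or weighting filters;
# MIN_LAG and the +/-4 search window are illustrative values.
SAMPLE_RATE_HZ = 8000
FRAME_SAMPLES = 20                    # frame size given in the abstract
BITS_PER_FRAME = (3 + 7) + (3 + 7)    # Algorithm 1: two 10-bit codebooks
MIN_LAG = 20                          # hypothetical shortest lag
NUM_LAGS = 2 ** 7                     # a 7-bit index addresses 128 lags

frame_ms = 1000.0 * FRAME_SAMPLES / SAMPLE_RATE_HZ          # one frame of delay
rate_bps = BITS_PER_FRAME * SAMPLE_RATE_HZ / FRAME_SAMPLES  # bits per second


def candidate(past, lag, n):
    """Return n samples starting `lag` samples back in the past excitation.
    Lags shorter than n are extended by periodic repetition."""
    start = len(past) - lag
    return [past[start + (i % lag)] for i in range(n)]


def search_adaptive_codebook(past, target, lags):
    """Return (lag, gain, score) maximizing corr^2 / energy over `lags`.
    Maximizing this score minimizes the squared error at the optimal gain."""
    best = None
    for lag in lags:
        cand = candidate(past, lag, len(target))
        energy = sum(c * c for c in cand)
        if energy == 0:
            continue  # an all-zero candidate cannot match anything
        corr = sum(t * c for t, c in zip(target, cand))
        score = corr * corr / energy
        if best is None or score > best[2]:
            best = (lag, corr / energy, score)
    return best


def full_search(past, target):
    """Algorithm 1 style: try every lag the 7-bit index can address."""
    return search_adaptive_codebook(
        past, target, range(MIN_LAG, MIN_LAG + NUM_LAGS))


def search_around_pitch(past, target, pitch, half_width=4):
    """Algorithm 3 style: try only lags near a backward pitch estimate."""
    lo = max(MIN_LAG, pitch - half_width)
    return search_adaptive_codebook(
        past, target, range(lo, pitch + half_width + 1))
```

Under these assumptions, 20 samples per frame give 2.5 ms and 20 bits per frame give 8000 bit/s, matching the stated delay and rate. With the illustrative window, the restricted search examines at most 9 lags instead of 128.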
The algorithms achieve a desirable balance among coding rate, delay and coding quality.

Second, based on the codebook size and codeword dimension of the 8 Kbps low-delay speech coding algorithm, a modified self-organizing feature map (SOFM) neural network is proposed to train the fixed codebook. The LBG algorithm is the conventional, classical vector quantization method, but the codebook training process always encounters a few outlier vectors. These outliers influence the contribution of the codewords and reduce compression performance. To improve codebook performance and reduce computational complexity, we modify the basic SOFM algorithm in codebook initialization, in the search strategy for the winning neuron and winning codeword, and in the weight adjustment over the topological neighborhood. The proposed algorithm is used to generate vector quantization codebooks, which are then used for low-delay speech coding. Simulations show that codebook performance is greatly improved.

Third, we analyze the gain filter of the low-delay speech coding algorithm and, for 20-sample frames, evaluate different predictors such as the weighted L-S recursive filter, the finite-memory recursive filter and the BP neural network. We use a new method, independent of the quantization signal-to-noise ratio, to evaluate the gain filter's performance and determine the filter coefficients. The various optimization schemes can therefore be compared and evaluated directly, before gain quantization. The L-D method is replaced by each of the three methods in turn. Experiments show that all three improve gain filter performance over L-D, and that the weighted L-S algorithm performs best.

Finally, the most important and original contribution of this dissertation is a new LPC, called Auditory-Acoustic-Hybrid-LPC, which combines acoustic and auditory properties.
Almost all existing algorithms use LPC coefficients that are based only on an acoustic model and do not reflect the auditory characteristics of the human auditory system. Moreover, the algorithms are evaluated by PESQ after encoding. This hampers further improvement of speech coding quality. We take human psychoacoustic features fully into account during encoding and, for the first time, use them in a low-delay algorithm. We use the auditory features of the MFCC to adjust those of the LPCMCC, and then feed the adjusted features back into the LPC coefficients, obtaining new LPC coefficients that reflect both acoustic and auditory features. Experimental results show that the new LPC greatly improves speech coding quality and raises the PESQ score. This is of positive significance for the coding algorithm.
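Feeding cepstral features back into LPC coefficients relies on the fact that LPC coefficients and LPC cepstrum coefficients can be converted into each other. The sketch below shows only that standard recursion, not the dissertation's method: the mel warping that yields the LPCMCC and the MFCC-based adjustment are omitted. It assumes the convention H(z) = 1 / (1 - sum_k a_k z^-k), and the function names are hypothetical.

```python
# LPC <-> cepstrum round trip using the standard recursion (a sketch only).
# Assumed convention: H(z) = 1 / (1 - sum_k a_k z^-k); list index 0 holds a_1
# (or c_1). Mel warping and the MFCC-based adjustment are not modeled here.


def lpc_to_cepstrum(a, n_cep):
    """Convert [a_1..a_p] to the cepstrum [c_1..c_n_cep] of the all-pole model."""
    p = len(a)
    c = []
    for n in range(1, n_cep + 1):
        acc = a[n - 1] if n <= p else 0.0
        # c_n += sum over k of (k/n) * c_k * a_{n-k}, with 1 <= n-k <= p
        for k in range(max(1, n - p), n):
            acc += (k / n) * c[k - 1] * a[n - k - 1]
        c.append(acc)
    return c


def cepstrum_to_lpc(c, p):
    """Recover [a_1..a_p] from the first p cepstrum coefficients."""
    a = []
    for n in range(1, p + 1):
        acc = c[n - 1]
        # Invert the forward recursion using the a_k already recovered.
        for k in range(1, n):
            acc -= (k / n) * c[k - 1] * a[n - k - 1]
        a.append(acc)
    return a
```

In such a scheme, any adjustment would be applied to the cepstrum between the two calls; with no adjustment, the round trip returns the original coefficients up to rounding error.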
Keywords/Search Tags: Adaptive Codebook, Backward Pitch Search, Self-Organizing Feature Map (SOFM), Gain Filter, Auditory Features, LPC Mel Cepstrum Coefficient (LPCMCC)