Research On Low-bit-rate Wideband Speech Coding Algorithms Based On The Sinusoidal Speech Model

Posted on: 2007-04-04  Degree: Doctor  Type: Dissertation
Country: China  Candidate: N Ying  Full Text: PDF
GTID: 1118360185455287  Subject: Communication and Information System
Abstract/Summary:
As the demand for speech quality grows, low-bit-rate wideband speech coding is playing an increasingly important role in speech signal processing, communication, and network systems. Compared with narrowband speech, wideband speech (50~7000 Hz) sounds more natural, realistic, and comfortable because of the 50~200 Hz band, and is clearer and more intelligible because of the 3400~7000 Hz band. This thesis focuses on the characteristics of wideband speech and develops wideband speech coding algorithms at bit rates of 6~16 kbit/s.

Based on an analysis of the current state of, and demand for, low-bit-rate wideband speech coding techniques, the thesis uses modern digital signal processing methods to study speech characteristics and their detection algorithms, and introduces short-time digital signal processing, window functions, voiced/unvoiced classification, pitch detection, and Linear Predictive Coding (LPC). With the aim of reconstructing natural-sounding speech at low computational complexity, new low-bit-rate speech coding algorithms based on the Sinusoidal Speech Model (SSM) are proposed. In addition, two SSM-based parameter estimation algorithms are given: a voiced/unvoiced classification algorithm that uses the third cumulant, and an accurate pitch detection algorithm that applies the least mean square method to an Improved Sub-harmonic to Harmonic Ratio (ISHR) obtained from the SSM. Simulation results indicate that the proposed speech parameter estimation algorithms are robust to noise.

The innovations of the dissertation can be summarized in the following three aspects.

(1) A variable low-bit-rate wideband speech coding algorithm based on the Basic Sinusoidal Speech Model (BSSM) is realized.
This algorithm operates at 13.92 kbit/s (improved to 9.05 kbit/s). For parameter quantization it adopts multiple quantization techniques; in particular, fitting quantization of the amplitude parameters yields high reconstructed speech quality at a low bit rate.

(2) A Modified Bi-band Mix-excited LPC (MBME-LPC) algorithm with a bit rate of 7.95 kbit/s is developed. First, to account for the non-strict periodicity of the speech signal in the frequency domain, the algorithm adds frequency offsets as transmitted parameters to improve the Harmonic Sinusoidal Speech Model (HSSM) and preserve reconstructed speech quality. Second, it uses LPC in the frequency domain for voiced frames and LPC in the time domain for unvoiced frames, which reduces computational complexity and improves reconstructed speech quality.

(3) Building on the advantages of the HSSM in low-bit-rate coding and on the non-strict periodicity of speech, a new Harmonic Phase Model (HPM) of speech is studied under the minimum mean-square error criterion, and its optimized phase estimation model and a simplified algorithm for this estimation are presented. With this phase estimation, a new speech coding algorithm at 8.85 kbit/s is obtained.

The principal work on low-bit-rate wideband speech coding in this thesis is as follows.

(1) A variable-rate wideband speech coding algorithm within 9.03 kbit/s using the Basic Sinusoidal Speech Model (BSSM) is realized.

(2) Because speech, unlike music, is not strictly harmonic, its harmonic frequencies show offsets that vary with time. To reduce the bit rate further, an MBME-LPC algorithm at 7.95 kbit/s is developed on the basis of the HSSM, using these frequency offsets to preserve speech quality.

(3) A new Harmonic Phase Model (HPM) of speech is studied, and its optimized phase estimation model and a simplified algorithm for this estimation are presented.
It combines the advantages of parametric coding and waveform coding and achieves high reconstructed speech quality.

(4) All speech coding algorithms proposed in this thesis are evaluated by computer simulation against the international wideband speech coding standard G.722.2, using objective and subjective evaluation criteria. The results show that the algorithms achieve high reconstructed speech quality and strong noise robustness. To a certain extent, their results are close to or better than those of some of the speech coding algorithms in G.722.2.

Chapter 1 discusses the current state of speech coding, the requirements for low-bit-rate wideband speech coding, and the possibility of using the SSM in low-bit-rate wideband speech coding algorithms.

Chapter 2 presents the techniques of short-time speech analysis, window functions, voiced/unvoiced classification, pitch detection, and LPC. Based on this study, a new voiced/unvoiced classification algorithm and an improved SHR pitch detection algorithm are presented.

Chapter 3 introduces Vector Quantization (VQ), which is widely used in low-bit-rate speech coding. A VQ procedure for the LPC and phase parameters is then proposed that applies Dynamic Time Warping (DTW) in split VQ to search the codebook with short delay.

In Chapter 4, a variable-rate wideband speech coding algorithm based on the BSSM is developed, using multiple quantization techniques for the parameters, in particular the matching quantization method for the amplitude parameters. This algorithm is then improved to obtain a 9.05 kbit/s coding algorithm.
Finally, objective and subjective speech quality evaluations compare the reconstructed speech with that of the ITU wideband speech coding standard G.722.2.

In Chapter 5, considering the non-strict periodicity of voiced speech in the frequency domain, a 7.95 kbit/s MBME-LPC speech coding algorithm is proposed that modifies the HSSM by adding frequency offset parameters to improve reconstructed speech quality. The algorithm also adopts LPC parameters in the frequency domain, which perform better than LPC in the time domain in increasing reconstructed speech quality and reducing quantization noise. The algorithm is tested through objective and subjective evaluation.

The phase model remains a difficult and active research topic in speech coding and processing. In Chapter 6, a new phase model for the HSSM is studied as a substitute for the frequency offsets that represent the non-strict periodicity of speech. Its optimum phase estimation model is then derived using the minimum mean-square error method. Applying the new phase model, a new speech coding algorithm at 8.85 kbit/s is presented. Finally, this algorithm is also evaluated by objective and subjective tests.

The last chapter gives a brief summary of the dissertation and the prospects for further research.
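
The abstract does not give the synthesis equations, so the following is only a minimal sketch of the general idea behind a harmonic sinusoidal model with per-harmonic frequency offsets: each frame is a sum of cosines whose frequencies sit near integer multiples of the pitch. The function name `synthesize_frame`, the 20 ms frame length, and the parameter layout are assumptions made for illustration and are not taken from the thesis; the 16 kHz sampling rate is the usual choice for 50~7000 Hz wideband speech.

```python
import math

def synthesize_frame(f0, amplitudes, phases, offsets, fs=16000, n_samples=320):
    """Synthesize one frame as a sum of near-harmonic sinusoids.

    Harmonic k (1-based) has frequency k * f0 + offsets[k - 1] in Hz,
    amplitude amplitudes[k - 1], and initial phase phases[k - 1] in radians.
    fs = 16000 Hz and n_samples = 320 (20 ms) are illustrative assumptions.
    """
    frame = []
    for n in range(n_samples):
        t = n / fs  # time of sample n in seconds
        sample = 0.0
        for k, (a, phi, df) in enumerate(zip(amplitudes, phases, offsets), start=1):
            freq = k * f0 + df  # strictly harmonic when df == 0
            sample += a * math.cos(2.0 * math.pi * freq * t + phi)
        frame.append(sample)
    return frame

# Two harmonics of a 200 Hz pitch; the second is shifted by a 3 Hz offset.
frame = synthesize_frame(200.0, [1.0, 0.5], [0.0, 0.0], [0.0, 3.0])
```

Setting every offset to zero reduces this to a strictly harmonic model, which is the case the frequency offsets (and, in Chapter 6, the phase model) are meant to relax.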
Keywords/Search Tags:wideband speech coding, low bit rate, sinusoidal speech model, harmonic sinusoidal speech model, Phase Model, the third cumulant, subharmonic-harmonic, the minimum mean-square error method
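
The third cumulant listed among the keywords can be illustrated with a small sketch. The abstract does not describe how the thesis builds its voiced/unvoiced classifier on this statistic, so the code below only computes the zero-lag third-order cumulant of a frame (the mean of the cubed, mean-removed samples); the function name and the test signal are assumptions for illustration. The statistic is zero in expectation for Gaussian noise and for any amplitude-symmetric signal, while a phase-locked harmonic signal can give a clearly non-zero value.

```python
import math

def third_cumulant_zero_lag(frame):
    """Zero-lag third-order cumulant: mean of (x - mean(x)) ** 3."""
    n = len(frame)
    mean = sum(frame) / n
    return sum((x - mean) ** 3 for x in frame) / n

# Illustrative frames: 320 samples at 16 kHz, 200 Hz pitch (4 full periods).
fs, f0, n_samples = 16000, 200.0, 320
theta = [2.0 * math.pi * f0 * n / fs for n in range(n_samples)]

pure_tone = [math.cos(th) for th in theta]                            # symmetric
harmonic = [math.cos(th) + 0.5 * math.cos(2.0 * th) for th in theta]  # asymmetric

c3_tone = third_cumulant_zero_lag(pure_tone)      # close to 0
c3_harmonic = third_cumulant_zero_lag(harmonic)   # close to 0.375
```

For the harmonic frame, the only term of the expanded cube with a non-zero average is 3 · cos²θ · 0.5 cos 2θ, whose mean over whole periods is 3 · 0.5 · 0.25 = 0.375. A practical classifier would compare such a statistic against a threshold, which would have to be tuned and is not specified in the abstract.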