
Study On Matching Pursuit Based Low-bit Rate Speech Coding

Posted on: 2003-07-29
Degree: Doctor
Type: Dissertation
Country: China
Candidate: W Y Zhang
Full Text: PDF
GTID: 1118360095456146
Subject: Computer application technology
Abstract/Summary:
Speech coding technology has achieved high-quality reconstructed speech at high and medium bit rates. At low and very low bit rates, however, achieving high speech quality remains a challenging problem with both theoretical significance and potential practical value. This has led many researchers to explore new methods and techniques, such as sinusoidal modeling and parameter quantization. Following the direction of sinusoidal modeling and analysis, this thesis adopts matching pursuit techniques together with a psychoacoustic model, explores novel methods for sinusoidal modeling and for the quantization of model parameters, and discusses low-bit-rate speech coding and its related problems. The major contributions of this thesis are as follows:

1. Matching pursuit is applied to speech enhancement, and a method is provided for determining the coherent-ratio threshold used in the matching-pursuit-based enhancement procedure. With this method, a noisy signal can be enhanced efficiently over a rather wide range even when the statistical properties of the signal and noise are unknown.

2. Sinusoidal modeling based on matching pursuit is studied. The concepts of dynamic masking threshold and perceptual gradient are proposed, together with an algorithm for sinusoidal modeling with perceptual gradient. The proposed method makes good use of the psychoacoustic model: the perceptual information contained in the synthesized signal is increased as much as possible during the modeling procedure, which improves modeling efficiency. The quality of the speech synthesized by this approach is rather high even when the model precision is low.

3. To encode the parameters of the sinusoidal model, vector quantization of the amplitude parameters and differential quantization of the frequency parameters are proposed and discussed. The frequency bin model, the random phase model, and the zero phase model are also discussed. All of these efficiently reduce the coding bit rate.

4. With the aims of reducing the bit rate and improving speech quality, a series of speech coding schemes is studied through gradual refinement, and an integrated coding scheme at 1.5-2.4 kbps is finally presented. Using different modeling methods and quantization techniques, the speech compression schemes discussed in this thesis are: compression based on general matching pursuit sinusoidal modeling; compression based on sinusoidal modeling with perceptual gradient; compression based on dynamic dictionary matching pursuit; a compression scheme using classified dynamic dictionaries; and an integrated compression scheme that combines sinusoidal modeling with perceptual gradient and classified dynamic dictionaries. These schemes show that matching-pursuit-based sinusoidal modeling has great potential in low-bit-rate speech coding and provides a new way to study this problem. The final proposed compression scheme takes more psychoacoustic effects into consideration and takes advantage of classified processing, dynamic dictionaries, and sinusoidal modeling with perceptual gradient. Both its bit rate and its speech quality are superior to those of some existing international coding schemes and standards.

5. A function named CAMDF is proposed, together with CAMDF-based algorithms for speech classification and pitch estimation. These algorithms are used in the coding schemes of this thesis. Because the CAMDF overcomes the shortcomings of the traditional average magnitude difference function (AMDF), the new pitch detection algorithm not only efficiently decreases estimation errors, but also simplifies the detection process and improves the precision of the estimated value. Speech classification using the CAMDF also yields satisfactory results.

Finally, the key points of the thesis are summarized, some improvements to be done in the...
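
As an illustration of the general matching pursuit sinusoidal modeling named in contributions 2 and 4, the following is a minimal sketch of plain matching pursuit over a dictionary of sinusoids. It is not the thesis's algorithm: the perceptual gradient, dynamic masking threshold, and dynamic dictionaries are not reproduced here, and the function names (`build_dictionary`, `matching_pursuit`), the frame length of 64 samples, and the test signal are all assumptions chosen for the example.

```python
import math


def build_dictionary(n, freqs):
    """Return unit-norm cosine and sine atoms of length n.

    freqs are normalized frequencies in cycles per sample (hypothetical grid).
    Each atom is a tuple (frequency, kind, samples).
    """
    atoms = []
    for f in freqs:
        for kind, fn in (("cos", math.cos), ("sin", math.sin)):
            raw = [fn(2 * math.pi * f * t) for t in range(n)]
            norm = math.sqrt(sum(v * v for v in raw))
            if norm > 1e-12:  # skip degenerate (all-zero) atoms
                atoms.append((f, kind, [v / norm for v in raw]))
    return atoms


def matching_pursuit(signal, atoms, n_iter):
    """Greedy decomposition: at each step, pick the atom most correlated
    with the residual and subtract its projection."""
    residual = list(signal)
    picks = []
    for _ in range(n_iter):
        best_atom, best_coef = None, 0.0
        for atom in atoms:
            coef = sum(r * v for r, v in zip(residual, atom[2]))
            if best_atom is None or abs(coef) > abs(best_coef):
                best_atom, best_coef = atom, coef
        residual = [r - best_coef * v for r, v in zip(residual, best_atom[2])]
        picks.append((best_atom[0], best_atom[1], best_coef))
    return picks, residual


# Toy frame: two sinusoids whose frequencies lie on the dictionary grid.
N = 64
signal = [
    3.0 * math.cos(2 * math.pi * 5 * t / N) + math.sin(2 * math.pi * 12 * t / N)
    for t in range(N)
]
atoms = build_dictionary(N, [k / N for k in range(1, N // 2)])
picks, residual = matching_pursuit(signal, atoms, n_iter=2)
for freq, kind, coef in picks:
    print(freq, kind, coef)
print("residual energy:", sum(r * r for r in residual))
```

In this sketch the selection criterion is the magnitude of the inner product with the residual, which is the standard matching pursuit rule. A perceptually weighted variant would replace that criterion with one derived from a psychoacoustic model, which the abstract describes only at a high level.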
Keywords/Search Tags: Matching Pursuit, Speech Coding, Speech Enhancement, Sinusoidal Modeling, Pitch Estimation, Psychoacoustic Model