
Research And Implementation On 600bps Speech Coding Algorithm

Posted on: 2016-04-06
Degree: Master
Type: Thesis
Country: China
Candidate: D D He
Full Text: PDF
GTID: 2348330488974477
Subject: Communication and Information System
Abstract/Summary:
In recent years, although more and more high-speed, wideband telecommunication systems have come into practical use, low-bit-rate (LBR) and very-low-bit-rate (VLBR) speech communications remain important and continue to attract attention. One reason is that LBR and VLBR speech communications are effective tools for providing secure speech communication. In addition, they consume little transmission bandwidth and can therefore be used under many severe channel conditions. In this thesis, a 600 bps speech coding algorithm is studied and implemented.

In the LBR and VLBR area, mixed excitation linear prediction (MELP) is a typical and successful speech coding method that encodes the original speech at a bit rate of 2.4 kbps. Many LBR and VLBR speech coders have been developed from or derived from MELP. Because the MELP algorithm performs well, this thesis also uses it as the basis for a 600 bps speech coding algorithm.

In our 600 bps speech coding scheme, each speech frame is 25 ms long and three frames make up a superframe. For each frame, four types of parameters are extracted from the original speech signal in the encoder and used to reconstruct the speech signal in the decoder: the line spectral frequencies (LSF), which represent the vocal tract characteristics; the voicing flag, which distinguishes voiced from unvoiced frames; the pitch, which gives the fundamental frequency of the speech; and the gain, which measures the energy of the speech frame.
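The frame structure fixes the bit budget available per superframe. A minimal sketch of that arithmetic, using only the figures stated in this abstract (600 bps target rate, 25 ms frames, three frames per superframe); the function and variable names are illustrative and not taken from the thesis:

```python
from fractions import Fraction

FRAME_MS = 25               # frame length stated in the abstract
FRAMES_PER_SUPERFRAME = 3   # frames per superframe stated in the abstract
TARGET_BPS = 600            # target bit rate of the codec

def superframe_budget(frame_ms, frames, bps):
    """Return (superframe duration in ms, bits available per superframe)."""
    duration_ms = frame_ms * frames
    # Exact rational arithmetic avoids floating-point rounding in the budget.
    bits = Fraction(bps * duration_ms, 1000)
    return duration_ms, bits

duration_ms, bits = superframe_budget(FRAME_MS, FRAMES_PER_SUPERFRAME, TARGET_BPS)
print(duration_ms, bits)  # 75 45
```

At 600 bps, each 75 ms superframe therefore has 45 bits to share among the parameters of its three frames.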
The superframe, composed of three consecutive speech frames, is encoded jointly across the four parameters using 45 bits.

To further reduce the bit rate and improve speech quality, discontinuous transmission (DTX) functionality, based on a voice activity detection (VAD) algorithm and a comfort noise generator (CNG) algorithm, is incorporated into the framework of the designed 600 bps speech coding scheme. In the encoder, DTX uses the VAD algorithm to handle noise and speech separately; because a noise frame uses far fewer bits than a speech frame, the bit rate is reduced. In the decoder, when a noise frame is detected, the CNG algorithm generates the corresponding comfort noise to preserve the continuity of the speech.

For practical application, the designed 600 bps speech codec has been ported to the TI TMS320C6416 DSP and implemented in real time on a TMS320C6416 test board, and the codec has been optimized for that platform. The optimization methods include compiler optimization, the use of intrinsic functions, and C code optimization, which reduce the complexity of the algorithm.

Test and evaluation results show that the designed 600 bps speech codec achieves high intelligibility with natural-sounding speech; the perceptual evaluation of speech quality (PESQ) score is about 2.158. In high-SNR environments, the speech coding algorithm with discontinuous transmission maintains speech intelligibility while reducing the bit rate. The computational complexity of the 600 bps speech codec is about 45 million cycles per second (MCPS), so it can be implemented on most existing DSP platforms.
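To make the DTX saving concrete, here is a rough sketch of the average bit rate when noise superframes are sent with fewer bits than speech superframes. The 45-bit speech superframe and the 75 ms duration (3 x 25 ms) follow from the abstract; the noise-superframe size and the speech activity factor are hypothetical values chosen only for illustration, since the abstract does not state them.

```python
SUPERFRAME_S = 0.075   # 3 frames x 25 ms, from the abstract
SPEECH_BITS = 45       # bits per speech superframe, from the abstract

def average_bitrate(activity, noise_bits,
                    speech_bits=SPEECH_BITS, superframe_s=SUPERFRAME_S):
    """Mean bit rate in bps when a fraction `activity` of superframes is speech.

    `noise_bits` is the number of bits spent on a noise superframe; this
    value is an assumption, not a figure from the thesis.
    """
    if not 0.0 <= activity <= 1.0:
        raise ValueError("activity must be in [0, 1]")
    mean_bits = activity * speech_bits + (1.0 - activity) * noise_bits
    return mean_bits / superframe_s

# Hypothetical scenario: 60% speech activity, 9 bits per noise superframe.
rate = average_bitrate(activity=0.6, noise_bits=9)
print(f"average rate: {rate:.0f} bps")
```

With full speech activity the function returns the nominal 600 bps; any silence in the input pulls the average below that, which is the effect the DTX mechanism relies on.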
Keywords/Search Tags: Speech Coding, Mixed Excitation Linear Prediction, Vocoder, Very Low Bit Rate, 600 bps