Compensation for missing voice coding frames in packet transmission systems

Posted on: 2001-06-30
Degree: Ph.D
Type: Dissertation
University: Oklahoma State University
Candidate: Kohler, Mary Antoinette
Full Text: PDF
GTID: 1468390014959091
Subject: Engineering
Abstract/Summary:
Scope and method of study. Packet-based communication systems, such as the Internet, are becoming more prevalent for two-way transmission of voice. Packet-switched networks are not always reliable, and the network may lose or excessively delay packets containing voice data. Missing or excessively delayed voice packets cause unacceptable degradation in the speech. The purpose of this study was to develop techniques to compensate for missing speech frames when a packet-switched network is the transport medium. The study evaluated a variety of voice coding algorithms, including pulse code modulation (PCM), differential pulse code modulation (DPCM), μ-law, code excited linear prediction (CELP), mixed excitation linear prediction (MELP), and Global System for Mobile Communications (GSM). The evaluation compared the new techniques to frame repetition, a popular method for dealing with this problem, using both objective and subjective measurements.
Findings and conclusions. Two missing frame compensation techniques emerged from this study: Naturalness Preserving Transform reconstruction and Markov Chain Prediction. Naturalness Preserving Transform reconstruction creates missing speech samples or voice coding parameters using an iterative reconstruction algorithm that exploits the properties of the Naturalness Preserving Transform. Markov Chain Prediction uses weighted probabilities from a transition matrix trained on speech segments to predict the missing parameters of a voice-coding algorithm. Both Naturalness Preserving Transform reconstruction and Markov Chain Prediction outperform frame repetition. Markov Chain Prediction requires a complex training procedure before operation to create the transition matrix. It operates only in the receiver, so it is interoperable with systems that do not use it. It also uses the quantization technique of the native voice-coding algorithm, so the bit rate is unchanged.
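The idea behind Markov Chain Prediction can be illustrated with a small sketch. This is not the dissertation's implementation: it assumes a first-order chain over the integer quantizer indices of a single coder parameter, and it takes the "weighted probabilities" to mean a probability-weighted average of the possible next indices. The function names (`train_transition_matrix`, `predict_missing`, `conceal`) are hypothetical.

```python
import numpy as np

def train_transition_matrix(sequences, num_levels):
    """Estimate first-order transition probabilities from training index sequences."""
    counts = np.zeros((num_levels, num_levels))
    for seq in sequences:
        for prev, nxt in zip(seq[:-1], seq[1:]):
            counts[prev, nxt] += 1
    row_sums = counts.sum(axis=1, keepdims=True)
    # Assumption: a state never seen in training maps to itself,
    # which reduces to plain frame repetition for that state.
    return np.where(row_sums > 0,
                    counts / np.maximum(row_sums, 1),
                    np.eye(num_levels))

def predict_missing(prev_index, probs):
    """Probability-weighted estimate of the next quantizer index."""
    levels = np.arange(probs.shape[0])
    return int(round(float(probs[prev_index] @ levels)))

def conceal(received, probs):
    """Replace None entries (lost frames) using the last available index."""
    out = []
    for idx in received:
        if idx is None:
            # Assumption: with no history at all, fall back to index 0.
            idx = predict_missing(out[-1], probs) if out else 0
        out.append(idx)
    return out

# Receiver-side use: train once offline, then conceal losses as they occur.
probs = train_transition_matrix([[0, 1, 2, 3, 0, 1, 2, 3]], num_levels=4)
restored = conceal([0, 1, None, 3], probs)
```

Because the prediction is an index into the coder's existing quantizer, a scheme of this shape adds nothing to the transmitted bitstream, which is consistent with the receiver-only, unchanged-bit-rate property described above.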
Naturalness Preserving Transform reconstruction produces the highest quality speech in the presence of missing frames. Over 94% of users preferred it to frame repetition, and its signal-to-noise ratio is 40 to 85 decibels higher, depending on the type of speech. It requires a transformation in the transmitter and an inverse transformation at the receiver, and the additional quantization it requires may change the native bit rate.
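For context, the frame repetition baseline and a signal-to-noise ratio comparison of the kind reported above can be sketched as follows. This is only an illustration under stated assumptions: frames are rows of a NumPy array, loss is marked by a boolean flag per frame, and SNR is the plain ratio of total signal energy to total error energy in decibels. The abstract does not say which SNR variant the study used, and the names `repeat_last_frame` and `snr_db` are hypothetical.

```python
import numpy as np

def repeat_last_frame(frames, lost):
    """Frame repetition: replace each lost frame with the previous output frame.

    frames: array-like of shape (num_frames, frame_len)
    lost:   iterable of bools, True where the frame did not arrive
    """
    out = np.array(frames, dtype=float)
    for i, is_lost in enumerate(lost):
        if is_lost:
            # A lost first frame has nothing to repeat, so it is zeroed (assumption).
            out[i] = out[i - 1] if i > 0 else 0.0
    return out

def snr_db(reference, estimate):
    """Signal-to-noise ratio in dB between the original and the concealed signal."""
    ref = np.asarray(reference, dtype=float)
    noise = ref - np.asarray(estimate, dtype=float)
    noise_power = np.sum(noise ** 2)
    if noise_power == 0:
        return float("inf")
    return 10.0 * np.log10(np.sum(ref ** 2) / noise_power)

original = np.array([[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]])
concealed = repeat_last_frame(original, lost=[False, True, False])
baseline_snr = snr_db(original, concealed)
```

Any other compensation technique can be scored with the same `snr_db` call on its own output, and the difference between the two values gives a decibel gain over frame repetition.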
Keywords/Search Tags:Missing, Voice coding, Naturalness preserving transform reconstruction, Systems, Markov chain prediction, Frame