Research On Low Bit Rate Audio Coding

Posted on: 2004-04-26
Degree: Doctor
Type: Dissertation
Country: China
Candidate: RAED S.H. AL-MOUSSAWY
GTID: 1118360155474036
Subject: Circuits and Systems
Abstract/Summary:
This thesis develops and optimizes a signal model for scalable perceptual audio coding. A hybrid, signal-adaptive model for audio consisting of sines + transients + noise (STN) is presented. The STN model is well suited to quality-scalable and rate-scalable algorithms for efficient transmission or storage of CD- or FM-quality audio at low to medium bit rates, i.e., between 6 and 64 kilobits per second (kbps). A perceptually based algorithm is proposed for extracting the sinusoidal components of the STN model. The algorithm is an analysis-by-synthesis overlap-add sinusoidal model based on conjugate matching pursuit (MP). The psychoacoustics of human hearing are incorporated explicitly by modifying the MP metric with the time-varying masking threshold of the input signal. The algorithm is shown to select the most perceptually relevant sinusoidal STN components very accurately, particularly for the sparse parameter representations that low-rate coding applications ultimately require. A unique processing structure is proposed for STN transient and pre-echo control. The proposed methodology relies on consistent use of the STN sinusoidal model and hence avoids critically sampled filter banks or other non-parametric, potentially bit-intensive alternatives for representing transients and eliminating pre-echo artifacts. An effective compression scheme is developed for the STN model parameters. The scheme exploits long-term correlations of STN parameter trajectories over time to achieve both very high coding gains and robustness to packet losses. Informal listening results have shown that the STN codec performed competitively with transform coders at high bit rates and outperformed parametric coders at low bit rates.
The STN system is a step towards bridging the gap between low-rate, low-fidelity parametric coders and high-rate, high-fidelity transform coders based on critically sampled perfect-reconstruction filter banks. In addition, the STN codec allows high-quality time-scale and pitch-scale modifications in the coded domain.
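The idea of weighting the matching-pursuit metric by a masking threshold can be illustrated with a small sketch. This is not the thesis's implementation: it assumes a single frame with no window or overlap-add, restricts candidate frequencies to FFT bin centres, and takes the masking threshold as a precomputed input array. The names `perceptual_mp`, `mask`, and `n_sines` are hypothetical. For bin-centred frequencies, a complex exponential and its conjugate are orthogonal, so projecting the residual onto the conjugate pair yields a real sinusoid whose amplitude and phase can be read directly from the FFT.

```python
import numpy as np

def perceptual_mp(frame, mask, n_sines):
    """Greedy sinusoid extraction with a masking-weighted selection metric.

    frame   : real signal frame of even length N
    mask    : assumed masking threshold per rfft bin (length N//2 + 1, > 0)
    n_sines : number of sinusoids to extract
    Returns a list of (bin, amplitude, phase) and the final residual.
    """
    n = len(frame)
    t = np.arange(n)
    residual = np.asarray(frame, dtype=float).copy()
    mask = np.asarray(mask, dtype=float)
    bins = np.arange(1, n // 2)  # skip DC and Nyquist
    params = []
    for _ in range(n_sines):
        spectrum = np.fft.rfft(residual)
        # Selection metric: residual energy relative to the masking threshold.
        weighted = np.abs(spectrum[bins]) ** 2 / mask[bins]
        k = bins[np.argmax(weighted)]
        # Projection onto the conjugate pair of atoms at bin k.
        amp = 2.0 * np.abs(spectrum[k]) / n
        phase = np.angle(spectrum[k])
        residual -= amp * np.cos(2.0 * np.pi * k * t / n + phase)
        params.append((int(k), float(amp), float(phase)))
    return params, residual
```

With a flat `mask` the loop reduces to ordinary energy-based matching pursuit. Raising `mask` at a bin lowers that bin's priority, so a strong but heavily masked component can be selected after a weaker, more audible one.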
Keywords/Search Tags: Perceptual Audio Coding, Parametric Coding, Sinusoidal Modeling, Multiresolution Analysis, Psychoacoustics