Research On Parameter Representation And Objective Quality Assessment Of Speech

Posted on: 2001-11-09
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Q Fu
Full Text: PDF
GTID: 1118360002951300
Subject: Signal and Information Processing
Abstract/Summary:
The representation of speech signals and the measurement of spectral distortion are two vital issues in speech processing research. Although many effective theories and methods have been established in this area, research remains active as understanding of speech signals deepens and applications of speech processing multiply. Objective assessment of speech quality is a direct application of this theory. This dissertation focuses on parameter representations of speech signals, their corresponding distortion measures, and their application to the objective assessment of speech quality. The major contributions of this dissertation are as follows:

1. The dissertation proposes the BSCC (Bark-Scale Cepstrum Coefficient) distortion measure, in which a cosine-fringed critical filter bank, instead of the Mel-frequency triangular filter bank, is used for frequency warping. Theoretical analysis and experimental results show that the BSCC distortion measure performs better than the MFCC (Mel-Frequency Cepstrum Coefficient) distortion measure and is comparable to the Bark spectrum distortion measure. Its computational complexity, however, is similar to that of MFCC and much lower than that of the Bark spectrum distortion measure, which makes it practical for real-time speech processing applications.

2. The dissertation proposes a novel wavelet transform, the Bark wavelet transform, based on the Bark frequency scale derived from speech perception experiments. Mathematically, it is a non-orthogonal, over-complete, invertible, and self-inverting wavelet transform whose properties match those of the human cochlear filter. As a feature extraction method for speech recognition, it offers higher time resolution than MFCC at comparable frequency resolution, because the local basis functions of the wavelet transform allow a shorter analysis frame. Consonant recognition experiments suggest that features based on the Bark wavelet transform are remarkably superior to MFCC.

3. Based on a well-prepared speech signal database and the results of thorough subjective assessment, we have built a system for the objective assessment of speech quality. In building this system, we explored a range of methods and distortion measures, including LPC cepstrum, MFCC, Bark spectrum, (weighted) log spectrum, and their combinations. Moreover, using a novel method, we have solved the problem of synchronizing the original speech signal with the distorted signal that results from transmission through a communication system. The system is therefore practically applicable.

4. The dissertation proposes a neural-network-based method for the objective assessment of speech quality. The method is a single-step strategy implemented with a feed-forward neural network. Conventional methods usually estimate the MOS (Mean Opinion Score) in two steps: 1) calculating the average distortion; 2) mapping the average distortion value to a MOS estimate by means of non-linear regression analysis. The proposed method combines the two steps into one. It can adequately embody the perceptual properties of the human auditory system with simple computation, bypassing the problems caused by assumptions about the mathematical model of the distortion measure and regressive analy...
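
The conventional two-step MOS estimation described in contribution 4 can be sketched roughly as follows. This is only an illustrative sketch under stated assumptions, not the dissertation's method: the per-frame distortion is taken to be a plain Euclidean distance between cepstral vectors, the frames are assumed to be already time-aligned, and the logistic mapping with its `slope` and `midpoint` values is a hypothetical stand-in for a regression curve that would normally be fitted to subjective test data.

```python
import math


def cepstral_distance(c_ref, c_deg):
    """Euclidean distance between two cepstral vectors of equal length."""
    if len(c_ref) != len(c_deg):
        raise ValueError("cepstral vectors must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(c_ref, c_deg)))


def average_distortion(ref_frames, deg_frames):
    """Step 1: mean frame-wise distortion over time-aligned frames."""
    if not ref_frames or len(ref_frames) != len(deg_frames):
        raise ValueError("need the same non-zero number of frames")
    total = sum(cepstral_distance(r, d) for r, d in zip(ref_frames, deg_frames))
    return total / len(ref_frames)


def distortion_to_mos(distortion, slope=2.0, midpoint=1.5):
    """Step 2: map average distortion to a MOS estimate in the range 1 to 5.

    The logistic shape and the default slope/midpoint are hypothetical;
    in practice the mapping is fitted by non-linear regression against
    subjective scores.
    """
    return 1.0 + 4.0 / (1.0 + math.exp(slope * (distortion - midpoint)))


if __name__ == "__main__":
    # Toy cepstral frames (made-up numbers, for illustration only).
    reference = [[1.0, 0.5, 0.2], [0.9, 0.4, 0.1]]
    degraded = [[1.1, 0.3, 0.2], [0.7, 0.4, 0.3]]
    d = average_distortion(reference, degraded)
    print(f"average distortion = {d:.3f}, estimated MOS = {distortion_to_mos(d):.2f}")
```

The single-step approach described above would replace both functions with one feed-forward network that maps features directly to a MOS estimate, so no explicit form for the distortion measure or the regression curve has to be assumed.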
Keywords/Search Tags:representation of speech signals, spectrum distortion measure, wavelet theory, objective assessment of speech quality, neural network