Speech processing and modeling using a non-linear time-frequency algorithm

Posted on: 2009-08-22
Degree: Ph.D.
Type: Dissertation
University: Clarkson University
Candidate: McNamara, David M
Full Text: PDF
GTID: 1448390005955211
Subject: Engineering
Abstract/Summary:
This dissertation presents a new approach to speech processing in the form of two distinct techniques, both based on modeling speech as a sum of sinusoidal components. Sinusoidal modeling is widely used because it frames speech in a simple mathematical context: speech is represented by individual sinusoidal components, and the parameters of each component can be estimated by a variety of signal processing algorithms. This framework allows speech processing to be coupled with well-known Fourier-based algorithms as well as classic frequency-domain filtering techniques. Because sinusoidal modeling is easy to understand and implement, it has been applied frequently to speech and is widely reported in the literature. Arguably, all of the existing work in the area fails to accurately model the non-stationary parameters of a non-stationary sinusoidal component, namely amplitude, phase, and frequency. Instead, speech is assumed to be stationary over a small window of time. While this assumption often works in practice, it is not theoretically well grounded: the true sinusoidal parameters are not accurately modeled, and parameter variations within a frame are difficult to handle. Both of the proposed methods are based on a non-linear time-frequency algorithm capable of providing high-resolution estimates of the instantaneous frequency and amplitude of non-stationary sinusoids.

The presented material includes the motivation behind the development of the techniques and an account of the developed methodology. Numerical simulations, as well as examples of real speech signals, are presented to illustrate the performance of the proposed techniques. The techniques were developed with cochlear implants as the specific target application. However, the developed methodology can serve as a general framework for speech processing and could potentially be employed in a wide array of applications.

One of the proposed techniques follows a traditional VOCODER topology, breaking speech into channels by means of bandpass filtering for further processing. The other is based on the harmonic relationship between speech components, which exist at integer multiples of the fundamental, or pitch, frequency. A pitch estimation technique has been developed that derives estimates for all harmonic components.
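The harmonic model described above can be illustrated with a minimal synthesis sketch. This is not the dissertation's estimation algorithm; it only shows what a sum of harmonics with a pitch that changes inside a single short segment looks like, which is the case a stationary-frame assumption cannot represent. The function name, the pitch glide, and the amplitude values are all hypothetical choices made for illustration.

```python
import math

def synthesize_harmonics(f0_track, amp_tracks, fs):
    """Sum of harmonics whose pitch and amplitudes may vary sample by sample.

    f0_track   : instantaneous fundamental frequency in Hz, one value per sample
    amp_tracks : amp_tracks[k-1][n] is the amplitude of harmonic k at sample n
    fs         : sampling rate in Hz
    """
    signal = []
    phase = 0.0  # running phase of the fundamental, in radians
    for n, f0 in enumerate(f0_track):
        # Harmonic k sits at k times the fundamental, so its phase is k * phase.
        sample = sum(
            amps[n] * math.cos((k + 1) * phase)
            for k, amps in enumerate(amp_tracks)
        )
        signal.append(sample)
        # Integrate the instantaneous frequency to advance the phase.
        phase += 2.0 * math.pi * f0 / fs
    return signal

fs = 8000          # sampling rate in Hz (assumed)
n_samples = 800    # 100 ms of signal
# Hypothetical pitch glide from 100 Hz to 120 Hz within one short segment.
f0_track = [100.0 + 20.0 * n / (n_samples - 1) for n in range(n_samples)]
# Three harmonics with constant, decaying amplitudes (illustrative values).
amp_tracks = [[1.0 / (k + 1)] * n_samples for k in range(3)]
speech_like = synthesize_harmonics(f0_track, amp_tracks, fs)
```

Because every harmonic shares the phase of the fundamental, an estimate of the pitch track fixes the frequencies of all components at once, which is why a single pitch estimator can yield estimates for all harmonics.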
Keywords/Search Tags:Speech, Processing, Frequency, Techniques, Modeling, Components, Proposed, Developed