The Research On Restoration Of Throat Microphone Speech

Posted on:2013-12-29

Degree:Master

Type:Thesis

Country:China

Candidate:D W Feng

GTID:2248330374481484

Subject:Signal and Information Processing

Abstract/Summary:

A throat microphone (TM) is a transducer that picks up sound from skin vibrations near the throat. TM speech is intelligible but sounds unnatural. The TM picks up speech transmitted from the pharynx region together with the "buzz tone" of the larynx. A study of the acoustic characteristics of various sound units in TM and normal microphone (NM) speech shows that the two signals differ both in vocal tract characteristics and in the characteristics of the excitation source for different sound units. Despite these acoustic differences, simultaneously recorded TM and NM speech from the same speaker shares some common features, such as pitch and formant locations. The main objective of this thesis is to improve the naturalness and perceptual quality of TM speech.

A voice conversion method based on an artificial neural network (ANN) is presented to improve TM speech quality. The ANN is used to obtain a smooth mapping of the TM spectrum onto the NM spectrum for each frame. The main differences between TM and NM speech were identified by analyzing the acoustic and spectral characteristics of TM speech. These differences make it necessary to modify the characteristic parameters of the vocal tract function, and the modification methods are drawn from voice conversion methods.

A comparison of the acoustic characteristics of cepstral coefficients, line spectrum frequencies, and the mel-frequency cepstrum shows that the mel-frequency cepstrum is a better representation of TM speech, because its design takes perceptual factors into account. A comparison of the voices converted with a Gaussian mixture model (GMM) and with the ANN shows that the GMM-based conversion gives better quality.

One advantage of a throat microphone is that it provides a high signal-to-noise ratio (SNR) over the speech frequency range in noisy environments.
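
The frame-by-frame mapping from TM spectral features to NM spectral features described above can be illustrated with a very small network. This is only a minimal sketch under stated assumptions: the abstract does not give the network architecture, feature dimension, or training procedure, so the one-hidden-layer design, the hyperparameters, and the function names `train_frame_mapper` and `convert_frames` are all hypothetical. The inputs are assumed to be time-aligned feature matrices with one row per frame.

```python
import numpy as np


def train_frame_mapper(tm_feats, nm_feats, hidden=16, epochs=500, lr=0.1, seed=0):
    """Fit a one-hidden-layer network that maps TM frames to NM frames.

    Hypothetical helper: the layer size, learning rate, and epoch count are
    illustrative assumptions, not values taken from the thesis.
    tm_feats: (n_frames, d_in) array, nm_feats: (n_frames, d_out) array.
    """
    rng = np.random.default_rng(seed)
    d_in, d_out = tm_feats.shape[1], nm_feats.shape[1]
    W1 = rng.normal(0.0, 0.1, (d_in, hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(0.0, 0.1, (hidden, d_out))
    b2 = np.zeros(d_out)
    losses = []
    for _ in range(epochs):
        # Forward pass: tanh hidden layer, linear output layer.
        h = np.tanh(tm_feats @ W1 + b1)
        pred = h @ W2 + b2
        err = pred - nm_feats
        losses.append(float(np.mean(err ** 2)))
        # Backward pass for the mean squared error.
        g_pred = 2.0 * err / err.size
        g_W2 = h.T @ g_pred
        g_b2 = g_pred.sum(axis=0)
        g_z = (g_pred @ W2.T) * (1.0 - h ** 2)
        g_W1 = tm_feats.T @ g_z
        g_b1 = g_z.sum(axis=0)
        # Plain gradient descent update.
        W1 -= lr * g_W1
        b1 -= lr * g_b1
        W2 -= lr * g_W2
        b2 -= lr * g_b2
    return (W1, b1, W2, b2), losses


def convert_frames(params, tm_feats):
    """Apply the trained mapping to TM frames, returning estimated NM frames."""
    W1, b1, W2, b2 = params
    return np.tanh(tm_feats @ W1 + b1) @ W2 + b2
```

In practice the rows would be features such as mel-frequency cepstral coefficients extracted from simultaneously recorded TM and NM speech. The sketch leaves feature extraction and waveform resynthesis out.
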
This thesis explores the presence of speech, speaker, and language characteristics in TM speech for developing speech systems. The conversion system consists of a training part and a conversion part. In the training stage, mappings are learned for the components of the speech signal model, including the excitation and the vocal tract. Mapping models based on a neural network and on a Gaussian mixture model (GMM) were built. The acoustic features used in the conversion work include cepstral coefficients, line spectrum pairs, and the mel-frequency cepstrum. Finally, the converted voice was assessed with both subjective and objective evaluation.

Because TM speech is relatively immune to noise, the approach may be applicable in strong-noise environments.
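
The abstract does not say which objective measure was used. As an illustration only, mel-cepstral distortion (MCD) is one measure commonly used to evaluate voice conversion: it compares mel-cepstral coefficients of the converted speech with those of the reference NM speech, frame by frame. The sketch below assumes the two feature matrices are already time-aligned and that column 0 holds the energy-related 0th coefficient, which is excluded. The function name is hypothetical.

```python
import numpy as np


def mel_cepstral_distortion(ref_mcep, conv_mcep):
    """Mean mel-cepstral distortion in dB between two aligned feature matrices.

    Assumption: both inputs have shape (n_frames, n_coeffs), are time-aligned,
    and store the 0th (energy) coefficient in column 0, which is skipped.
    """
    ref = np.asarray(ref_mcep, dtype=float)
    conv = np.asarray(conv_mcep, dtype=float)
    if ref.shape != conv.shape:
        raise ValueError("feature matrices must have the same shape")
    diff = ref[:, 1:] - conv[:, 1:]
    # Per-frame distortion: (10 / ln 10) * sqrt(2 * sum of squared differences).
    per_frame = (10.0 / np.log(10.0)) * np.sqrt(2.0 * np.sum(diff ** 2, axis=1))
    return float(np.mean(per_frame))
```

A lower value means the converted features are closer to the reference. Identical inputs give 0 dB.
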

Keywords/Search Tags:

Throat Microphone, Voice Conversion, Artificial Neural Network, Gaussian Mixture Model, Mel-Frequency Cepstrum

Related items

1	Key Algorithm In High Quality Voice Conversion System
2	The Research Of Voice Conversion Based On The Spectral Parameters Of Vocal Tract
3	Research On Technologies Of Voice Conversion Based On Gaussian Mixture Model
4	Voice Conversion Algorithm Based On The Acoustic Characteristics Of Personality Study
5	Voice Conversion Based On GMM And Codebook Mapping
6	The Research Of Extracting Of Pathological Voice's Characteristics And Recognition Based On Wavelet Transformation And Gaussian Mixture Model
7	The Research On Vocal Tract Spectrum And Pitch Frequency Transformation In Voice Conversion
8	Research On Methods For Voice Conversion
9	Research On Modeling Methods For Voice Conversion
10	Research On The Chinese Voice Conversion System Based On GMM