
An automatic speech recognition oriented study on segmentation, low dimensional feature extraction, and temporal trajectory information capture

Posted on: 2003-08-29
Degree: Ph.D
Type: Dissertation
University: City University of New York
Candidate: Zhu, Yonggang
Full Text: PDF
GTID: 1468390011482107
Subject: Physics
Abstract/Summary:
Accurate and efficient automatic speech recognition requires feature vectors that are highly discriminative for the categories of interest yet low in dimensionality. Recent studies of feature extraction from mel spectra show that classical mel-frequency cepstral coefficients (MFCCs) may fail to capture some important cues present in local spectral correlates. We therefore study feature extraction together with dimensionality reduction on mel spectra, using hybrid models of neural networks and Euclidean distance that we propose. This approach is mainly inspired by the adaptive nature of neural networks. With classical MFCCs as a benchmark, features extracted by our hybrid models give comparable or much better classification rates at a significantly lower dimensionality. A time-warping recurrent neural network, intended to recognize phonemes and consonant-vowel (CV) syllables by efficiently capturing temporal trajectory information, is studied with mel features, MFCCs, and our features. The results suggest that low-dimensional features extracted by linear Euclidean neural networks may be better suited to this purpose.
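For orientation, the classical MFCC benchmark mentioned above can be sketched as a fixed linear projection of log mel spectra onto the leading DCT-II basis functions, followed by a Euclidean-distance decision. The sketch below is a minimal illustration under stated assumptions, not the hybrid models of the dissertation: the function names (`dct_ii_matrix`, `cepstra_from_log_mel`, `nearest_mean_classify`), the nearest-class-mean classifier, and the array shapes are all hypothetical choices made here for illustration.

```python
import numpy as np


def dct_ii_matrix(n_out, n_in):
    """Return the first n_out rows of the orthonormal DCT-II basis of size n_in."""
    k = np.arange(n_out)[:, None]          # coefficient index
    n = np.arange(n_in)[None, :]           # mel band index
    basis = np.cos(np.pi * k * (2 * n + 1) / (2 * n_in))
    basis *= np.sqrt(2.0 / n_in)           # orthonormal scaling
    basis[0] *= np.sqrt(0.5)               # extra scaling for the DC row
    return basis


def cepstra_from_log_mel(log_mel, n_coeffs):
    """Project log mel spectra (frames x bands) onto n_coeffs DCT-II bases.

    Keeping only the leading coefficients is the fixed, non-adaptive
    dimensionality reduction used by classical MFCCs.
    """
    log_mel = np.atleast_2d(log_mel)
    basis = dct_ii_matrix(n_coeffs, log_mel.shape[1])
    return log_mel @ basis.T


def nearest_mean_classify(features, class_means):
    """Assign each feature vector to the class mean at the smallest Euclidean distance.

    Assumption: this simple classifier stands in for a distance-based
    decision rule; it is not the method studied in the dissertation.
    """
    diffs = features[:, None, :] - class_means[None, :, :]
    dists = np.linalg.norm(diffs, axis=2)
    return np.argmin(dists, axis=1)
```

An adaptive alternative, in the spirit of the abstract, would replace the fixed DCT basis with a projection matrix learned from data; that learning step is deliberately left out here because the abstract does not specify it.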
Keywords/Search Tags:Feature, Low, Neural networks