Objective speech quality estimation using Gaussian mixture models

Posted on: 2006-08-28
Degree: M.Sc.(Eng
Type: Thesis
University: Queen's University at Kingston (Canada)
Candidate: Falk, Tiago Henrique
GTID: 2458390008465195
Subject: Engineering
Abstract/Summary:
In this thesis, we propose the use of Gaussian mixture models (GMMs) as simple yet effective predictors of perceived speech quality. A large pool of perceptual distortion features is extracted from speech files. Initially, statistical data mining algorithms are used to sift out the most relevant variables from the pool. We show that the five most salient feature variables are sufficient to construct good GMM-based estimators of subjective listening quality. It is shown, however, that the features selected by the data mining schemes limit the performance of the proposed voice quality predictor. To address this limitation, a novel feature selection algorithm that directly optimizes GMM prediction performance is also proposed. The algorithm performs an N-survivor search, trading off complexity against accuracy via the parameter N. Comparisons with PESQ, the current state-of-the-art speech quality estimation algorithm, show that the proposed algorithm achieves, on average, 26.12% higher correlation and 18.04% lower root-mean-squared error (RMSE). Tested on unseen data, the proposed algorithm reduces RMSE by an average of 41% relative to PESQ.
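The abstract does not spell out the N-survivor search, so the following is only a minimal sketch of one plausible reading: a beam-style search that, at each stage, extends every surviving feature subset by one feature and keeps the N best-scoring subsets. The function name `n_survivor_search`, the `score` callback, and the toy score table are hypothetical; in the thesis setting the score would presumably be the GMM prediction performance measured on held-out data, which is not reproduced here.

```python
from typing import Callable, FrozenSet, List, Sequence, Tuple


def n_survivor_search(
    features: Sequence[str],
    score: Callable[[FrozenSet[str]], float],
    subset_size: int,
    n_survivors: int,
) -> Tuple[FrozenSet[str], float]:
    """Beam-style subset search (an assumed reading of "N-survivor search").

    At each stage every surviving subset is extended by one unused feature,
    and only the `n_survivors` highest-scoring subsets are kept.
    """
    if not 1 <= subset_size <= len(features):
        raise ValueError("subset_size must be between 1 and len(features)")
    if n_survivors < 1:
        raise ValueError("n_survivors must be at least 1")

    survivors: List[FrozenSet[str]] = [frozenset()]
    for _ in range(subset_size):
        # Build all one-feature extensions; the set removes duplicate subsets.
        candidates = {s | {f} for s in survivors for f in features if f not in s}
        # Keep the N best candidates (higher score is better).
        survivors = sorted(candidates, key=score, reverse=True)[:n_survivors]

    best = survivors[0]
    return best, score(best)


# Hypothetical toy scores standing in for GMM prediction performance.
TOY_SCORES = {
    frozenset("a"): 0.50, frozenset("b"): 0.40, frozenset("c"): 0.10,
    frozenset("ab"): 0.60, frozenset("ac"): 0.55, frozenset("bc"): 0.90,
}


def toy_score(subset: FrozenSet[str]) -> float:
    return TOY_SCORES[subset]


greedy_best, greedy_score = n_survivor_search(["a", "b", "c"], toy_score, 2, 1)
wider_best, wider_score = n_survivor_search(["a", "b", "c"], toy_score, 2, 2)
```

In this toy table, N = 1 (purely greedy search) commits to feature `a` and ends at the pair {a, b}, while N = 2 also keeps `b` alive and finds the better pair {b, c}. This illustrates the complexity versus accuracy trade-off controlled by N: a larger N evaluates more candidate subsets per stage but is less likely to miss a good combination.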
Keywords/Search Tags:Speech, Algorithm, Proposed