Study On Deep Learning-Based Speech Quality Assessment

Posted on:2016-05-17

Degree:Master

Type:Thesis

Country:China

Candidate:B Q Wang

GTID:2308330461478015

Subject:Signal and Information Processing

Abstract/Summary:

Speech transmission and communication systems play an important role in everyday communication, and the speech quality a system delivers is the decisive factor in its performance. Subjective assessment of speech quality is reliable, but it is time-consuming to carry out and inflexible. Input-to-output objective assessment can correlate very highly with subjective assessment, but it requires the original input signal as a reference, and that signal is rarely available. An output-based objective assessment method with high correlation is therefore urgently needed.

This thesis proposes a new output-based objective assessment method based on deep learning. The speech is first preprocessed and its features are extracted. A deep belief network with trained model parameters then maps the features to the corresponding partition of speech quality levels, which gives the objective prediction.

The main points of this thesis are as follows:

(1) Voice activity detection is applied to the preprocessed speech to find the voice frames used for feature extraction. Silent frames are removed, which helps improve the accuracy of the speech quality assessment.
(2) Improved Gammatone-Frequency Cepstral Coefficients and Perceptual Linear Predictive Cepstral Coefficients are extracted as features. These features model auditory perception better, which improves the correlation between objective and subjective assessment.
(3) Deep learning is introduced to learn the speech features and map them to a non-linear partition of speech quality levels to obtain the prediction results. A Fuzzy Support Vector Machine, which is a shallow learning method, is used for comparison. The results show the advantage of introducing deep learning into the speech quality assessment system.

Tests on many speech samples show that the proposed output-based objective method is effective, flexible, and robust. The correlation between the predicted results and subjective assessment is relatively high and can reach 0.91.
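
As a rough illustration of two steps named in the abstract, the sketch below removes silent frames and then scores predictions against subjective ratings. It is not the thesis's implementation: the abstract does not say which voice activity detection algorithm or which correlation measure was used, so the energy-threshold rule, the Pearson coefficient, and every name and parameter here (`frame_signal`, `energy_vad`, `threshold_ratio`, `pearson`) are assumptions chosen for the example.

```python
import math


def frame_signal(samples, frame_len, hop):
    """Split a sample sequence into fixed-length frames; an incomplete tail is dropped."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]


def energy_vad(frames, threshold_ratio=0.1):
    """Keep frames whose mean energy exceeds a fraction of the peak frame energy.

    Assumption: a simple energy threshold stands in for the (unspecified)
    voice activity detector described in the abstract.
    """
    energies = [sum(x * x for x in f) / len(f) for f in frames]
    peak = max(energies, default=0.0)
    if peak == 0.0:
        return []  # no frames, or all frames are silent
    return [f for f, e in zip(frames, energies) if e > threshold_ratio * peak]


def pearson(xs, ys):
    """Pearson correlation between predicted and subjective scores (assumed measure)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0.0 or sy == 0.0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sx * sy)


# Synthetic signal: silence, a short tone, then silence again.
tone = [0.5 * math.sin(2 * math.pi * n / 16) for n in range(160)]
signal = [0.0] * 160 + tone + [0.0] * 160

frames = frame_signal(signal, frame_len=80, hop=80)
voiced = energy_vad(frames)
print(len(frames), "frames in total,", len(voiced), "kept for feature extraction")
```

In a full system the frames that survive this step would go on to feature extraction, and `pearson` would be called on the predicted quality scores and the subjective ratings for the same set of samples.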

Keywords/Search Tags:

Speech quality assessment, Gammatone-Frequency Cepstral Coefficients, Perceptual Linear Predictive Cepstral Coefficients, deep learning, Fuzzy Support Vector Machines

Related items

1	The Research Of Speaker Recognition Based On Vector Quantization
2	Study Of Methods Of Speech Features Extraction Of Ando Tibetan
3	Study On Deep Learning-based Speaker Recognition
4	Study On Synthetic Speech Detection Algorithm Based On Deep Learning
5	Noise-robust Auditory Feature Extraction And Optimization For Speech Recognition
6	Hidden Markov Model Based Automatic Speech Recognition Using Mel Frequency Cepstral Coefficients In Nepalese
7	Discrimination Based On Support Vector Machine Speaker
8	Anti-noise Power Normalized Cepstral Coefficients For Two-level Robust Environmental Sounds Recognition In Real Noisy Conditions
9	Estimation of cepstral coefficients for robust speech recognition
10	Software Design And Implementation Of Voiceprint Recognition Module Based On ARM