Speech Intelligibility Evaluation Based On Audio Characteristics

Posted on: 2019-04-21
Degree: Master
Type: Thesis
Country: China
Candidate: F Gao
Full Text: PDF
GTID: 2348330569979980
Subject: Computer technology
Abstract/Summary:
With the rapid development of modern information technology, more and more speech signal processing technologies have been integrated into our lives, such as speech recognition, speech enhancement, and intelligent voice interaction. The maturity of these technologies has greatly facilitated our daily lives and work. How to accurately assess speech performance has long been a topic of active research. Today, speech performance is mainly evaluated in terms of speech quality and speech intelligibility. An accurate and effective speech assessment method not only helps improve the performance of communication systems, but also indirectly verifies the performance of speech enhancement algorithms. Subjective evaluation reflects the human experience of speech signals most faithfully, but it is time-consuming and labor-intensive, and it cannot cope with the volume of speech data that needs to be evaluated today. In recent years, computer-based objective evaluation methods whose results come as close as possible to human subjective evaluation scores have therefore become a research focus.

This thesis first reviews the basic knowledge of speech signals and the development history of speech intelligibility assessment indices. It briefly describes the leading subjective and objective speech evaluation methods in current use and analyzes how objective intelligibility assessment methods have been improved and evaluated, which provides the motivation for the research that follows.

A Gammatone filter bank, which models the auditory characteristics of the basilar membrane of the human cochlea, is applied to filter the excitation spectrum in the frequency domain. The energy spectrum distortion of the excitation spectrum of the speech signal is then calculated in combination with the weighted band signal-to-noise ratio method. Compared with the traditional method, the improved method achieves a significantly higher correlation coefficient with subjective evaluation scores under Babble, Car, and Street background noise.

The thesis also studies the relative contributions of vowels and consonants to speech intelligibility. Based on the normalized covariance evaluation method, the relative mean square of the signal-to-noise ratio (SNR) of the band excitation spectrum of speech is studied, and the root mean square (RMS) value is used as the threshold for frequency-domain segmentation. The speech frequency band is divided into two levels: a high-SNR band and a low-SNR band. The Normalized Covariance Measure (NCM) is used to calculate objective speech intelligibility scores for the two bands. Experimental results show that high-SNR speech contains more speech intelligibility information and yields a score close to the overall assessment score. In addition, the proposed unified model combines the relative contributions of the two bands to speech intelligibility. When the weight coefficient is 0.2, the results of the proposed evaluation model correlate highly with subjective evaluation scores.
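The band-splitting idea summarized above can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the abstract does not give the formulas, so the sketch assumes the commonly used form of the NCM (per-band envelope correlation mapped to an apparent SNR clipped to [-15, 15] dB), assumes the RMS of the per-band SNR values is the split threshold, and assumes the weight coefficient applies to the low-SNR band. The function names `band_snr_db`, `ncm_score`, `split_by_rms`, and `unified_score` are hypothetical, and the band envelopes are expected to come from a filter bank (such as a Gammatone bank) that is not shown here.

```python
import numpy as np


def band_snr_db(clean_env, proc_env):
    """Apparent SNR (dB) per band from the normalized covariance of envelopes.

    clean_env, proc_env: arrays of shape (n_bands, n_frames) holding the
    band envelopes of the clean and the degraded/processed speech.
    """
    c = clean_env - clean_env.mean(axis=1, keepdims=True)
    p = proc_env - proc_env.mean(axis=1, keepdims=True)
    num = (c * p).sum(axis=1) ** 2
    den = (c ** 2).sum(axis=1) * (p ** 2).sum(axis=1)
    # Squared correlation coefficient, kept strictly inside (0, 1).
    r2 = np.clip(num / np.maximum(den, 1e-12), 1e-12, 1.0 - 1e-12)
    snr = 10.0 * np.log10(r2 / (1.0 - r2))
    return np.clip(snr, -15.0, 15.0)


def ncm_score(snr_db, weights=None):
    """Map clipped band SNRs to [0, 1] and take a weighted average."""
    if snr_db.size == 0:
        return 0.0
    ti = (snr_db + 15.0) / 30.0
    if weights is None:
        weights = np.ones_like(ti)
    return float((weights * ti).sum() / weights.sum())


def split_by_rms(snr_db):
    """ASSUMPTION: bands at or above the RMS of the band SNRs are 'high'."""
    threshold = np.sqrt(np.mean(snr_db ** 2))
    high = snr_db >= threshold
    return high, ~high


def unified_score(clean_env, proc_env, w=0.2):
    """ASSUMPTION: w weights the low-SNR score, (1 - w) the high-SNR score."""
    snr = band_snr_db(clean_env, proc_env)
    high, low = split_by_rms(snr)
    if not high.any():
        return ncm_score(snr[low])
    if not low.any():
        return ncm_score(snr[high])
    return w * ncm_score(snr[low]) + (1.0 - w) * ncm_score(snr[high])
```

Under these assumptions, `unified_score(clean_env, proc_env, w=0.2)` returns a value between 0 and 1. Whether 0.2 belongs to the low-SNR or the high-SNR band in the thesis's model would need to be checked against the full text.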
Keywords/Search Tags:speech intelligibility, Gammatone, SNR, RMS value, evaluation model