Research On Voice Endpoint Detection Method In Noisy Environment

Posted on:2022-06-01

Degree:Master

Type:Thesis

Country:China

Candidate:S Y Luo

GTID:2518306524451874

Subject:Electronics and Communications Engineering

Abstract/Summary:

The main purpose of voice endpoint detection is to separate the voice segments from the non-voice segments of a speech signal. In practice the signal is often accompanied by various kinds of noise, and the presence of noise directly degrades endpoint detection performance. Starting from voice endpoint detection methods based on characteristic parameters, this thesis studies voice endpoint detection in noisy environments. The specific research work covers the following aspects.

Firstly, to address the poor robustness of the features used by single-feature voice endpoint detection methods at low signal-to-noise ratios, the first-dimension coefficient (GFCC₀) of the Gammatone frequency cepstral coefficients (GFCC) of the speech signal is introduced into the endpoint detection task, and endpoint detection is realized by combining it with the multi-window spectral subtraction method. In four noise environments, including babble and volvo, the GFCC₀ feature achieves higher detection accuracy than the spectral entropy method and the logarithmic spectrum distance method. Although combining it with multi-window spectral subtraction increases detection time, doing so further improves the detection accuracy of the GFCC₀ feature under low signal-to-noise ratio babble and volvo noise.

Secondly, to address the insufficient endpoint detection performance of multi-feature fusion methods in complex noise environments, this thesis proposes a fusion feature combining GFCC and Mel frequency cepstral coefficients (MFCC). The GFCC₀ and MFCC₀ features of the speech signal are multiplied to construct the first type of fusion feature. This feature tracks the voice segment effectively, but its ability to track unvoiced sounds within the speech segment is slightly insufficient in some noise environments.

Thirdly, to address the limited ability of the first type of fusion feature to track unvoiced segments, this thesis proposes an adaptive weighted fusion method. It uses the projection feature, which tracks unvoiced sounds well, and the band-partitioning spectral entropy feature, which tracks voiced sounds well, to improve the ability of the GFCC₀ feature to track both. The result is a second type of fusion feature that takes into account the tracking of both unvoiced and voiced sounds in the speech segment.

Finally, because endpoint recognition with fixed thresholds limits endpoint detection performance, this thesis uses a double-threshold method with adaptively estimated thresholds as the endpoint recognition method once the two types of fusion features have been extracted, and realizes endpoint detection of noisy speech based on each of the two fusion features. Experimental results in seven noise environments, including babble and volvo, show that the first type of fusion feature effectively improves endpoint detection accuracy in five of the noise environments, while the second type achieves better results than the comparison algorithms in all seven. In the volvo noise environment in particular, detection accuracy can exceed 94.5%.
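
The abstract does not give the formulas behind the fusion or the threshold estimation, so the following Python sketch only illustrates the general idea of product fusion followed by double-threshold endpoint recognition. It is not the thesis's algorithm. All function names, the assumption that the first few frames contain only noise, and the multipliers `k_low` and `k_high` are hypothetical choices made for illustration; the input is any per-frame feature track that is larger during speech.

```python
from statistics import mean, pstdev
from typing import List, Sequence, Tuple


def fuse_by_product(a: Sequence[float], b: Sequence[float]) -> List[float]:
    """Frame-wise product of two per-frame feature tracks (for example GFCC0 and MFCC0)."""
    if len(a) != len(b):
        raise ValueError("feature tracks must have the same number of frames")
    return [x * y for x, y in zip(a, b)]


def estimate_thresholds(feature: Sequence[float], noise_frames: int = 5,
                        k_low: float = 2.0, k_high: float = 4.0) -> Tuple[float, float]:
    """Estimate (low, high) thresholds from leading frames.

    Assumption (not from the thesis): the first `noise_frames` frames are noise only,
    and thresholds sit k_low / k_high standard deviations above the noise mean.
    """
    lead = list(feature[:noise_frames])
    mu = mean(lead)
    spread = max(pstdev(lead), 1e-9)  # avoid identical thresholds on constant noise
    return mu + k_low * spread, mu + k_high * spread


def detect_segments(feature: Sequence[float], low: float, high: float,
                    min_len: int = 3) -> List[Tuple[int, int]]:
    """Double-threshold detection: a frame above `high` seeds a segment,
    which is then extended in both directions while frames stay above `low`.
    Returns inclusive (start_frame, end_frame) pairs."""
    segments: List[Tuple[int, int]] = []
    n = len(feature)
    i = 0
    while i < n:
        if feature[i] > high:
            start = i
            while start > 0 and feature[start - 1] > low:
                start -= 1
            end = i
            while end + 1 < n and feature[end + 1] > low:
                end += 1
            if end - start + 1 >= min_len:  # drop very short bursts
                segments.append((start, end))
            i = end + 1
        else:
            i += 1
    return segments


if __name__ == "__main__":
    # Synthetic feature track: quiet frames, one louder region, quiet again.
    track = [1.0, 1.1, 0.9, 1.0, 1.05, 1.0, 2.0, 6.0, 7.0, 6.5, 2.0, 1.0, 1.0]
    low, high = estimate_thresholds(track)
    print(detect_segments(track, low, high))
```

Using two thresholds rather than one lets the high threshold reject brief noise spikes while the low threshold recovers the weaker onset and offset frames of a segment; estimating both from the signal itself is one simple way to avoid fixed values.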

Keywords/Search Tags:

voice endpoint detection, Gammatone frequency cepstral coefficient (GFCC), Mel frequency cepstral coefficient (MFCC), band-partitioning spectral entropy, multi-feature fusion

Related items

1	Research On Tibetan Voice Activity Detection Algorithm
2	Research And Implementation Of Chinese Continuous Speech Recognition System
3	A Study On Robust Speech Endpoint Detection Algorithms In Noisy Environment
4	Timbre Recognition Of Western Instruments
5	The Research Of Speaker Recognition Under Noisy Environment
6	Research On MFCC Characteristic Parameters And Kernel Function Selection Based On Support Vector Machine
7	Research On Speech Recognition Feature Extraction Algorithm And Soundprint Attendance System Realization
8	Anti-noise Power Normalized Cepstral Coefficients For Two-level Robust Environmental Sounds Recognition In Real Noisy Conditions
9	Study On Voice Activity Detection Methods In Heavy Noise Environments
10	Speech Recognition Access Control Applications