Research And Implementation Of Speech Enhancement Algorithm Based On Hybrid Feature Awareness

Posted on:2023-12-24

Degree:Master

Type:Thesis

Country:China

Candidate:Q He


GTID:2558306845990209

Subject:New Generation Electronic Information Technology (including quantum technology, etc.) (Professional Degree)

Abstract/Summary:


The main difficulty in single-channel speech enhancement is handling multi-source noise interference and diverse, time-varying human speech in complex, unknown environments. Traditional algorithms suffer from a mismatch with their prior models and therefore cannot handle the complex, dynamic scenarios of intelligent speech interaction applications. Although machine learning algorithms have achieved significant improvements in dealing with burst noise, their performance depends heavily on the completeness of the training set and the scale of the model, so they cannot be widely deployed on small interaction devices. To address these bottlenecks, this thesis proposes a general speech enhancement framework based on hybrid feature-aware algorithms, which efficiently improves the real-time scene noise suppression and personalized target speech enhancement capability of existing algorithms. Network learning strategies are designed and optimized for harsh conditions such as background noise robustness, fast training on unfamiliar scenes, and complex multi-source scenes. The speech enhancement system is implemented in software and hardware, and its performance is tested. The detailed contents are as follows:

(1) A general speech enhancement framework based on hybrid feature-aware algorithms is proposed to address the performance degradation caused by unfamiliar or burst noise. It consists of a background noise-aware module based on multi-head attention and a target-aware module based on speech phonemes. The background noise-aware module has two parts: extraction of multidimensional scene-based noise feature bases, and prediction of background noise features using a multi-head attention mechanism. The target-aware module extracts a personalized phonetic posteriorgram of the target speech from the noisy signal, introducing deep semantic information about human speech. These multidimensional features are adaptively fused and can be embedded into any single-channel speech enhancement algorithm to improve its performance. Experiments show that the speech quality and intelligibility evaluation metrics improve by 6.61% and 2.10% on average in unseen noise scenes, and subjective experiments show that the framework gives a better listening experience.

(2) Network learning strategies are proposed for complex, unfamiliar, multi-source application environments. They include a background noise robustness enhancement strategy with parameter optimization based on hybrid adversarial learning, fast training and transfer to unseen noise scenes based on pre-training and model fine-tuning, and a curriculum learning training strategy for multi-source noisy environments. Experiments demonstrate that these learning strategies improve speech quality and intelligibility scores by 8.01% and 2.88% in unseen noisy scenes, reduce training time, and adapt better to a wider range of noise scenarios.

(3) The software and hardware implementation, algorithm porting, and performance testing of the speech enhancement system are carried out on the Jetson AGX Xavier artificial intelligence development module. A method for optimizing the real-time performance of the system based on TensorRT is implemented. The experimental results show that the computational speed of the speech enhancement system increases by an average of 5.4 times, significantly reducing its processing time.

The results of this thesis provide a theoretical basis, a solution, and experimental support for implementing practical small-scale real-time speech enhancement systems. Such systems can be widely used in intelligent human-computer interaction, smart homes and smart cities, smart surveillance, and intelligent transportation, and have good academic and economic significance.
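The abstract does not describe how the curriculum learning strategy for multi-source noise is scheduled. As a rough illustration of the general idea only, the following Python sketch orders training mixtures from easy to hard and builds cumulative training stages. The difficulty criterion (fewer noise sources and higher SNR count as easier), the `Mixture` record, and the function names are all assumptions made for this example, not the thesis's method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mixture:
    """Hypothetical description of one noisy training mixture."""
    name: str
    num_noise_sources: int
    snr_db: float


def difficulty(mixture):
    # Assumed ordering: more noise sources is harder; at equal source
    # count, a lower SNR is harder (hence the negated SNR).
    return (mixture.num_noise_sources, -mixture.snr_db)


def curriculum_stages(mixtures, num_stages):
    """Return cumulative stages: each stage adds harder mixtures to the previous one."""
    ordered = sorted(mixtures, key=difficulty)
    stages = []
    for stage in range(1, num_stages + 1):
        # Ceiling division so that the final stage always covers every mixture.
        cutoff = -(-len(ordered) * stage // num_stages)
        stages.append(ordered[:cutoff])
    return stages


mixtures = [
    Mixture("babble+traffic", 2, 0.0),
    Mixture("fan", 1, 15.0),
    Mixture("fan_low_snr", 1, 5.0),
    Mixture("babble+traffic+music", 3, -5.0),
]
stages = curriculum_stages(mixtures, num_stages=2)
for index, stage in enumerate(stages, start=1):
    print(index, [m.name for m in stage])
```

In this sketch a model would be trained on stage 1 first and then on each later stage in turn, so the hardest multi-source mixtures are seen only after the easier single-source ones.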

Keywords/Search Tags:

Speech enhancement, Hybrid feature-aware network, Background noise-aware, Target speech feature-aware, Hybrid adversarial learning, Mel-scaled weighted reconstruction loss function, Curriculum learning strategy


Related items

1	Single-Channel Speech Enhancement Algorithm Based On Audio Feature Perception
2	Research On Speech Dereverberation Based On Deep Learning Under Complex Environment
3	Research And Implementation Of Lightweight Speech Enhancement Algorithm For Air Control
4	Deep Learning For Robust Speech Recognition
5	Research And Implementation Of Single-channel Speech Enhancement Model Based On Deep Learning
6	Speech Enhancement Algorithm Based On Deep Learning In Complex Background
7	Research On Speech Enhancement Technology In Low Signal-to-Noise Ratio Environment
8	Research On Single-Channel Speech Enhancement Based On Generative Adversarial Network
9	Research On Siamese Network Tracking Algorithm Based On Target-distractor Aware
10	Research On Single-channel Speech Enhancement Method Based On Deep Neural Network