Reducing background noise to improve speech quality and enable automatic speech recognition is a long-standing research topic in speech and acoustic processing and related fields. Speech enhancement refers to removing noise from noisy speech and is often used as a front-end preprocessor for other acoustic tasks, while speech recognition refers to the automatic transcription of speech signals into text by computers. Although deep learning-based speech enhancement and recognition have been studied for nearly a decade, maintaining the performance of speech algorithms in complex scenarios remains an open problem. In general, complex scenes pose five main challenges: (1) in low signal-to-noise-ratio scenes, the structural information of speech is submerged in noise, causing speech feature extraction to fail; (2) the noise intensity in complex scenes is dynamic and changeable, which places higher demands on the robustness of the speech enhancement system; (3) under unseen conditions, discriminative speech enhancement algorithms generalize poorly and are prone to failure; (4) limited personal resources cannot meet the resource (data, computing power) requirements of speech-related tasks; and (5) the distortion introduced by speech enhancement seriously degrades the accuracy of speech recognition.

To address challenges (1) and (2), this thesis proposes three speech enhancement algorithms for complex scenarios: information distillation-based IDANet, collaborative learning-based SECL, and iterative learning-based SEIL. Experimental results show that, compared with state-of-the-art speech enhancement algorithms, the three proposed algorithms significantly improve speech quality. To address challenge (3), this thesis explores the potential of generative algorithms for speech enhancement tasks and proposes a noise-aware conditional diffusion model dubbed NA-CDiffuSE. Compared with discriminative models, the proposed algorithm is less susceptible to overfitting and exhibits stronger generalization; compared with existing diffusion model-based speech enhancement methods, NA-CDiffuSE improves voice quality, showing significant advantages. To address challenge (5), this thesis designs a multi-task cascaded model that uses additional guidance from downstream tasks to constrain the training process of speech enhancement. At the same time, this thesis combines transfer learning, contrastive learning, and knowledge distillation to address challenge (4). By cascading the speech enhancement model with a lightweight speech recognition model, the designed algorithm effectively improves speech recognition accuracy in noisy scenarios while maintaining efficiency.

Based on the above research, this thesis designs and implements a speech system for urban traffic scenarios. The system allows users to record or upload speech and integrates a variety of enhancement and recognition algorithms. In addition to demonstrating individual algorithms, the system also supports comparison between them. As a result, the system meets the basic needs of speech enhancement and recognition.