Research On Lost Information Reconstruction Method In Speech Time-frequency Domain

Posted on:2022-12-01

Degree:Master

Type:Thesis

Country:China

Candidate:Y S Guan

GTID:2518306755493944

Subject:Electronics and Communications Engineering

Abstract/Summary:

In daily life, voice information is often lost during audio recording because of operating errors or damage to the storage device. In real-time audio and video communication, audio content is likewise lost when the network is unstable, through packet delay, loss, or jitter. In addition, transmission systems usually compress the audio signal to save bandwidth, and the resulting loss of high-frequency information degrades audio quality. To address these forms of speech information loss, this thesis uses deep learning to analyze and compensate locally recorded and network-transmitted audio quickly and accurately, and carries out research in three areas.

First, the thesis studies offline speech inpainting for long gaps of lost speech content. The aim is to explore how well a deep learning model captures the semantic relationship between the surrounding context and the lost speech, and to improve the model structure so that the reconstructed speech is of higher quality. Because packet loss in real-time communication is a pressing problem, the model is also extended to a causal system that performs real-time packet loss concealment.

Second, the thesis studies multidimensional feature extraction that integrates the restoration of time-domain waveform loss and frequency-domain spectral loss, yielding a joint reconstruction model for time-frequency information loss that is applicable to real-time narrowband speech transmission systems. Model size is balanced against reconstructed speech quality to meet efficiency requirements.

Finally, to remain compatible with real-time speech communication applications, the thesis studies a joint method for high-quality speech enhancement and packet loss concealment in noisy environments, implementing the two deep audio post-processing tasks as cascaded networks for real-world scenarios. After extensive experiments on network parameters and model size, the work effectively reduces the algorithmic delay of deep learning-based speech reconstruction and improves its practicality while preserving model performance.

Keywords/Search Tags:

Lost information reconstruction, deep learning, offline speech inpainting, real-time packet loss concealment, bandwidth expansion

Related items

1	Speech Compensation Algorithm Based On Deep Learning In VoIP Communication
2	Research On Speech Bandwidth Extension Using Deep Neural Network
3	Packet Loss Concealment Of Speech Transmissions Based On Compressive Sensing
4	Voice over Internet Protocol (VoIP) Packet Loss Concealment (PLC) by redundant transmission of speech information
5	Research On The Waveform Domain Anti Packet Loss Technology In Speech Communications
6	Voice Packet Loss Concealment Algorithm Based On VOIP
7	Research On Technologies Of Packet Loss Concealment For Mobile Audio Coding
8	Research And Implementation Of An Intelligent Inpainting System For Image Frames Oriented To Live Streaming
9	Research On NetEQ Technology In WebRTC Voice Engine
10	Research On Packet Loss Compensation Technology Over Video Transmission