
Research On Lost Information Reconstruction Method In Speech Time-frequency Domain

Posted on: 2022-12-01
Degree: Master
Type: Thesis
Country: China
Candidate: Y S Guan
Full Text: PDF
GTID: 2518306755493944
Subject: Electronics and Communications Engineering
Abstract/Summary:
In daily life, voice information is often lost during audio recording because of operator error or storage device damage. In real-time audio and video communication, audio content is also lost when the network is unstable, for example through packet delay, loss, or jitter. In addition, transmission systems usually compress the audio signal to save bandwidth, which removes high-frequency information and degrades audio quality. To address these forms of voice information loss, this thesis uses deep learning to analyze and compensate, quickly and accurately, for losses in both locally recorded audio and network-transmitted audio. The research covers three aspects.

First, the thesis addresses inpainting of long gaps in speech content in offline mode. The aim is to explore how well a deep learning model captures the semantic relationship between the context and the lost speech, and to improve the model structure so that the reconstructed speech is of higher quality. Because packet loss in real-time communication is an urgent problem, the approach is also extended to a causal system to realize real-time packet loss concealment.

Second, the thesis studies multidimensional feature extraction that integrates the restoration of time-domain waveform loss and frequency-domain spectral loss, yielding a joint reconstruction model for time-frequency domain information loss that is applicable to real-time narrowband speech transmission systems. The model scale is balanced against reconstructed speech quality to meet efficiency requirements.

Finally, to meet the requirements of real-time speech communication applications, the thesis studies a joint method for high-quality speech enhancement and packet loss concealment in noisy environments, and realizes the two deep audio post-processing tasks in real scenarios as cascaded networks. After extensive experiments adjusting the network parameters and model size, the research effectively reduces the algorithmic delay of deep learning-based speech reconstruction and improves its practicality while preserving model performance.
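
The two kinds of loss named in the abstract (missing packets in the time domain, and missing high-frequency content in the frequency domain) can be illustrated with a minimal NumPy sketch. This is not the thesis's method or data pipeline; the function names `drop_packets` and `remove_high_band`, the 16 kHz sample rate, the 20 ms packet length, and the 4 kHz cutoff are all assumptions chosen only to show what a reconstruction model would receive as input.

```python
import numpy as np


def drop_packets(wave, packet_len, lost_idx):
    """Zero the packets listed in lost_idx (hypothetical time-domain loss)."""
    out = wave.copy()
    for i in lost_idx:
        out[i * packet_len:(i + 1) * packet_len] = 0.0
    return out


def remove_high_band(wave, sample_rate, cutoff_hz):
    """Zero every frequency bin above cutoff_hz (hypothetical band loss)."""
    spec = np.fft.rfft(wave)
    freqs = np.fft.rfftfreq(len(wave), d=1.0 / sample_rate)
    spec[freqs > cutoff_hz] = 0.0
    return np.fft.irfft(spec, n=len(wave))


# Assumed test signal: one second of a 440 Hz tone plus a 6 kHz tone.
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
clean = np.sin(2 * np.pi * 440 * t) + 0.5 * np.sin(2 * np.pi * 6000 * t)

# Assumed 20 ms packets (320 samples); packets 5, 6 and 20 are lost.
lossy_time = drop_packets(clean, packet_len=320, lost_idx=[5, 6, 20])

# Assumed narrowband channel that keeps only content up to 4 kHz.
lossy_freq = remove_high_band(clean, sample_rate, cutoff_hz=4000)
```

In this sketch, `lossy_time` would be the input to an inpainting or packet loss concealment model, and `lossy_freq` the input to a bandwidth expansion model; a joint time-frequency model of the kind the abstract describes would need to handle both degradations together.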
Keywords/Search Tags: lost information reconstruction, deep learning, offline speech inpainting, real-time packet loss concealment, bandwidth expansion