In civil aviation air traffic control, speech is the most common means of interaction. Ground-air communication often takes place in complex noise environments, which greatly degrades its quality and efficiency. This thesis studies deep learning based speech enhancement methods to mitigate the impact of noise interference in this scenario. Based on an in-depth analysis and summary of current mainstream speech enhancement research, a series of studies is carried out around two main problems. The first is the self-adaptation problem of speech enhancement in complex noise environments. Existing methods are generally based on convolutional neural networks, whose fixed-parameter operation mode makes it difficult for the speech enhancement model to generalize across different noise environments and reduces its ability to extract features. The second is the problem of spatio-temporal differences. Research on speech enhancement methods in the time-frequency domain often ignores the differences between frequency bands and between timestamps in the time-frequency features, especially against complex noise backgrounds with a low signal-to-noise ratio (SNR), which makes it difficult for existing deep learning networks to learn an efficient mapping to the clean speech signal. This thesis addresses these problems, and its specific contributions are as follows:

1) This thesis proposes a speech enhancement method based on adaptive convolution and progressive learning. With an encoder-decoder based on redundant convolution as the main network structure, adaptive convolution replaces general convolution, and the appropriate multi-scale convolution module is selected dynamically according to the input. The network's abstract feature aggregation and the adjustment of the module's overall receptive field improve the model's generalization to different noise environments and its stability at low SNR. In addition, a two-stage progressive learning scheme is used to train the network parameters, which effectively increases the depth of the network and its overall feature abstraction ability at very small computational cost. Experiments under identical conditions verify that the model outperforms state-of-the-art models on all metrics.

2) This thesis proposes a speech enhancement algorithm based on regional convolution. Regional convolution provides a region-wise differentiated convolution scheme for the convolutional network structure. It addresses the limited capacity of the feature extractor caused by the fixed receptive field size and the parameter sharing mechanism of general convolutional neural networks, a limitation that weakens the model's ability to remove noise. The proposed method improves the overall performance of the model, its generalization ability, and its ability to handle complex noise. In addition, a dilated gated linear unit layer is used to further enlarge the overall receptive field of the model, and the enhanced speech is produced by the optimized model.

3) To demonstrate the practical effect of the algorithms more intuitively, a prototype speech enhancement system is developed for the civil aviation air traffic control environment, building on the algorithms studied in this thesis.
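
The idea behind the first contribution, choosing among multi-scale convolution branches according to the input, can be illustrated with a minimal sketch. This is not the network proposed in the thesis. It assumes a 1-D signal, a soft (softmax) weighting over the branches instead of a hard selection, and a simple linear gate driven by the mean absolute amplitude of the input. The names `conv1d_same`, `softmax`, `adaptive_conv1d` and the `gate_weights` parameter are hypothetical and chosen only for illustration.

```python
import math


def conv1d_same(signal, kernel):
    """Zero-padded 1-D correlation that keeps the input length.

    Assumes an odd kernel length so the window is centred on each sample.
    """
    half = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = i + k - half
            if 0 <= j < len(signal):  # samples outside the signal count as zero
                acc += w * signal[j]
        out.append(acc)
    return out


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def adaptive_conv1d(signal, kernels, gate_weights):
    """Blend multi-scale convolution branches with input-dependent weights.

    kernels:      one kernel per branch, each with a different receptive field.
    gate_weights: hypothetical (weight, bias) pair per branch; in a trained
                  network these would be learned parameters.
    Returns the blended output and the branch weights that were used.
    """
    branches = [conv1d_same(signal, k) for k in kernels]
    # Global descriptor of the input (assumption: mean absolute amplitude).
    descriptor = sum(abs(x) for x in signal) / len(signal)
    # One score per branch, turned into weights that sum to 1.
    scores = [w * descriptor + b for (w, b) in gate_weights]
    alphas = softmax(scores)
    output = [
        sum(a * branch[i] for a, branch in zip(alphas, branches))
        for i in range(len(signal))
    ]
    return output, alphas


if __name__ == "__main__":
    kernels = [[1.0], [1 / 3, 1 / 3, 1 / 3], [0.2] * 5]
    gate = [(-1.0, 0.0), (0.0, 0.0), (1.0, 0.0)]
    _, weights = adaptive_conv1d([2.0, -2.0, 2.0, -2.0, 2.0, -2.0], kernels, gate)
    print(weights)
```

With the gate values in the example, an input of larger amplitude shifts the weights toward the widest kernel, so the effective receptive field changes with the input while the kernels themselves stay fixed.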