
Research On Sound Source Localization Based On SELDnet

Posted on: 2022-11-01
Degree: Master
Type: Thesis
Country: China
Candidate: Y Min
Full Text: PDF
GTID: 2518306788455244
Subject: Automation Technology
Abstract/Summary:
Sound source localization plays an important role in daily life, the maintenance of public order, education, and rescue work. With the development of artificial intelligence, sound source localization based on deep learning has become an active research topic. Such methods mainly comprise signal preprocessing, model construction, and training. This thesis analyzes methods for sound source localization and sound event detection based on SELDnet and studies their existing difficulties in depth. The main work can be summarized as follows:

(1) Because the audio captured by a recording device inevitably contains environmental noise, the input audio features must be preprocessed. The purpose is to increase the useful information carried by the sound source signal and thereby improve the accuracy of the algorithm's predictions. Using the generalized cross-correlation with phase transform (GCC-PHAT) method from time-delay-based microphone sound source localization, together with principal component analysis, the input audio is transformed into joint features made up of the spectrogram and the GCC-PHAT features. This enriches the useful information in the input features and reduces the noise content of the original signal. In addition, to avoid deep learning models that each serve only a single function, and because the location of a sound source and the type of sound event are related, this thesis uses the SELDnet model to perform sound source localization and sound event detection jointly.

(2) In sound source localization based on deep learning, improving the computational speed of the model is an important research direction. Reducing the amount of computation while preserving prediction accuracy is a major difficulty. To improve localization accuracy and reduce the amount of computation, this thesis proposes a sound source localization and sound event detection network based on the bidirectional simple recurrent unit (Bi SRU-SELDnet), built on the convolutional recurrent network model SELDnet. The network takes the original amplitude and phase spectrogram features, combined with GCC-PHAT features extracted by traditional methods, as its input in order to make the input features more robust. A depthwise separable convolution module is introduced and combined with the bidirectional simple recurrent unit, which can be computed in parallel, to map the input features to two outputs: one for sound event detection and one for sound source localization. Experiments show that the Bi SRU-SELDnet model maintains the accuracy of sound source localization and sound event detection while reducing the training parameters of each batch compared with the baseline model and significantly improving the network's computing speed.

(3) Improving prediction accuracy is another important research direction in sound source localization based on deep learning. Achieving higher accuracy without resorting to a complex model is also difficult. To improve localization accuracy, this thesis proposes a sound source localization and sound event detection network based on multi-head attention (MHA-SELDnet). The network takes the original amplitude spectrogram features, combined with GCC-PHAT features reduced by principal component analysis, as its input. A multi-head attention mechanism is introduced to map the input features to two outputs: one for sound event detection and one for sound source localization. Experiments show that the MHA-SELDnet model significantly improves the accuracy of sound source localization and sound event detection without significantly increasing the complexity of the model...
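The GCC-PHAT feature named in the abstract can be illustrated with a short sketch. This is a generic textbook formulation for one microphone pair, not the thesis's own feature pipeline; the function name `gcc_phat`, its parameters, and the small constant added to avoid division by zero are assumptions made for illustration.

```python
import numpy as np

def gcc_phat(sig, ref, fs, max_tau=None):
    """Estimate the delay of `sig` relative to `ref` (in seconds) with GCC-PHAT.

    Hypothetical helper for illustration; returns (delay, correlation curve).
    """
    n = sig.shape[0] + ref.shape[0]        # zero-pad so the correlation is linear, not circular
    SIG = np.fft.rfft(sig, n=n)
    REF = np.fft.rfft(ref, n=n)
    cross = SIG * np.conj(REF)             # cross-power spectrum
    cross /= np.abs(cross) + 1e-12         # PHAT weighting: discard magnitude, keep phase
    cc = np.fft.irfft(cross, n=n)          # generalized cross-correlation over lags

    max_shift = n // 2
    if max_tau is not None:                # optionally restrict to physically possible lags
        max_shift = max(1, min(int(fs * max_tau), max_shift))

    # Reorder so that index `max_shift` corresponds to zero lag.
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    shift = int(np.argmax(np.abs(cc))) - max_shift
    return shift / fs, cc

# Usage: white noise delayed by 5 samples at an assumed 16 kHz sampling rate.
rng = np.random.default_rng(0)
ref = rng.standard_normal(1024)
sig = np.concatenate((np.zeros(5), ref[:-5]))
tau, cc = gcc_phat(sig, ref, fs=16000)
print(round(tau * 16000))                  # estimated delay in samples
```

In a SELDnet-style input, a correlation curve such as `cc` would be computed per frame and per microphone pair and stacked next to the spectrogram channels; how the thesis arranges or reduces these features is not specified in the abstract.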
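The parameter saving that depthwise separable convolution offers, mentioned in point (2), can be checked with simple arithmetic. The sketch below compares a standard 2-D convolution with a depthwise-plus-pointwise pair; the layer sizes (64 input channels, 64 output channels, 3x3 kernel) are hypothetical and are not taken from the thesis.

```python
def conv2d_params(c_in, c_out, k, bias=True):
    """Parameters of a standard k x k convolution."""
    return c_in * c_out * k * k + (c_out if bias else 0)

def depthwise_separable_params(c_in, c_out, k, bias=True):
    """Parameters of a depthwise k x k convolution followed by a 1 x 1 pointwise convolution."""
    depthwise = c_in * k * k + (c_in if bias else 0)   # one k x k filter per input channel
    pointwise = c_in * c_out + (c_out if bias else 0)  # 1 x 1 convolution that mixes channels
    return depthwise + pointwise

# Hypothetical layer sizes, chosen only for illustration.
standard = conv2d_params(64, 64, 3)                 # 36928
separable = depthwise_separable_params(64, 64, 3)   # 4800
print(standard, separable, round(standard / separable, 1))
```

For these assumed sizes the separable layer needs roughly one eighth of the parameters, which is the kind of reduction the Bi SRU-SELDnet design relies on; the actual figures for the thesis model depend on its layer configuration.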
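The multi-head attention mechanism referred to in point (3) can be sketched in a few lines of NumPy. This is standard scaled dot-product self-attention over a sequence of frame features, not the MHA-SELDnet implementation; the function names, the random weights, and the sizes (10 frames, 16 feature dimensions, 4 heads) are assumptions for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)   # subtract the max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_self_attention(x, w_q, w_k, w_v, w_o, num_heads):
    """x: (T, d_model) frame features; w_*: (d_model, d_model) projection matrices."""
    t, d_model = x.shape
    d_head = d_model // num_heads

    def split(m):                              # (T, d_model) -> (heads, T, d_head)
        return m.reshape(t, num_heads, d_head).transpose(1, 0, 2)

    q, k, v = split(x @ w_q), split(x @ w_k), split(x @ w_v)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_head)      # (heads, T, T)
    weights = softmax(scores, axis=-1)                       # each row sums to 1
    context = weights @ v                                    # (heads, T, d_head)
    merged = context.transpose(1, 0, 2).reshape(t, d_model)  # concatenate the heads
    return merged @ w_o, weights

# Usage with hypothetical sizes and random weights.
rng = np.random.default_rng(0)
T, D, H = 10, 16, 4
x = rng.standard_normal((T, D))
w_q, w_k, w_v, w_o = (rng.standard_normal((D, D)) * 0.1 for _ in range(4))
out, attn = multi_head_self_attention(x, w_q, w_k, w_v, w_o, num_heads=H)
print(out.shape, attn.shape)                   # (10, 16) (4, 10, 10)
```

Each head attends over all time frames at once, so the output for a frame can draw on context from the whole sequence. In a SELD network the resulting features would then feed the two output branches described in the abstract.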
Keywords/Search Tags: Deep learning algorithm, Convolutional recurrent network, Bidirectional simple recurrent unit, Generalized cross-correlation with phase transform, Principal component analysis, Multi-head attention mechanism