Research On Sound Source Localization Based On SELDnet

Posted on: 2022-11-01

Degree: Master

Type: Thesis

Country: China

Candidate: Y Min

Full Text: PDF

GTID: 2518306788455244

Subject: Automation Technology

Abstract/Summary:

Sound source localization plays an important role in daily life, public order maintenance, education, and rescue. With the development of artificial intelligence, sound source localization based on deep learning has become an active research topic. Such methods mainly comprise signal preprocessing, model construction, and training. This thesis analyzes methods for sound source localization and sound event detection based on SELDnet and studies their open difficulties in depth. The main work is summarized as follows:

(1) Because audio captured by a recording device inevitably contains environmental noise, the input audio features must be preprocessed. The aim is to increase the useful information in the sound source signal and thereby improve prediction accuracy. Using the generalized cross-correlation with phase transform (GCC-PHAT) method, taken from time-delay-based microphone sound source localization, together with principal component analysis, the input audio features are transformed into joint spectrogram and GCC-PHAT features. This enriches the useful information in the input features and reduces the noise content of the original signal. In addition, because related deep learning models typically serve a single function, and because source location information and sound event classes are naturally related, this thesis uses the SELDnet model to perform sound source localization and sound event detection simultaneously.

(2) In deep-learning-based sound source localization, improving computational speed is an important research direction. The main difficulty is reducing the amount of model computation while preserving prediction accuracy. To improve localization accuracy and reduce computation, this thesis proposes a sound source localization and sound event detection network based on the bidirectional simple recurrent unit (Bi SRU-SELDnet), built on the convolutional recurrent network model (SELDnet). The network takes the original magnitude and phase spectrogram features, combined with GCC-PHAT features extracted by traditional methods, as input in order to improve the robustness of the input features. A depthwise separable convolution module is introduced and combined with the bidirectional simple recurrent unit, which can be computed in parallel, to map the input features to two outputs: one for sound event detection and one for sound source localization. Experiments show that the Bi SRU-SELDnet model not only maintains the accuracy of sound source localization and sound event detection, but also reduces the training parameters of each batch compared with the baseline model and significantly improves network computing speed.

(3) Improving prediction accuracy is another important research direction in deep-learning-based sound source localization. The difficulty here is improving accuracy without making the model complex. To this end, this thesis proposes a sound source localization and sound event detection network based on multi-head attention (MHA-SELDnet). The network takes the original magnitude spectrogram features, combined with GCC-PHAT features after principal component analysis, as input. A multi-head attention mechanism is introduced to map the input features to two outputs: one for sound event detection and one for sound source localization. Experiments show that the MHA-SELDnet model significantly improves the accuracy of sound source localization and sound event detection without significantly increasing model complexity...
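The abstract does not give the thesis's own feature-extraction code, so the following is only a minimal sketch of the general GCC-PHAT technique it names, written with NumPy. The function name `gcc_phat`, the 16 kHz sample rate, the synthetic white-noise signal, and the 5-sample delay are all illustrative assumptions, not details from the thesis. The sketch whitens the cross-power spectrum of two microphone channels so that only phase information remains, then reads the time delay from the peak of the resulting cross-correlation.

```python
import numpy as np


def gcc_phat(sig, ref, fs):
    """Estimate the delay of `sig` relative to `ref` with GCC-PHAT.

    Hypothetical helper for illustration; not the thesis's implementation.
    Returns (delay in seconds, cross-correlation over lags -max_shift..+max_shift).
    """
    # Zero-pad to the combined length so circular correlation acts as linear.
    n = sig.shape[0] + ref.shape[0]
    SIG = np.fft.rfft(sig, n=n)
    REF = np.fft.rfft(ref, n=n)

    # Cross-power spectrum, then the PHAT weighting: divide by the magnitude
    # so every frequency bin contributes phase only.
    R = SIG * np.conj(REF)
    R = R / (np.abs(R) + 1e-12)  # small constant avoids division by zero

    cc = np.fft.irfft(R, n=n)

    # Reorder so negative lags come first and lag 0 sits at index max_shift.
    max_shift = n // 2
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))

    shift = int(np.argmax(cc)) - max_shift
    return shift / float(fs), cc


# Synthetic two-channel example (all values are assumptions for the demo).
fs = 16000                      # assumed sample rate in Hz
true_delay = 5                  # assumed delay in samples
rng = np.random.default_rng(0)
ref = rng.standard_normal(1024)                             # reference channel
sig = np.concatenate((np.zeros(true_delay), ref[:-true_delay]))  # delayed copy

tau, cc = gcc_phat(sig, ref, fs)
estimated_delay = int(round(tau * fs))
print("estimated delay in samples:", estimated_delay)
```

In a SELDnet-style pipeline the correlation vector `cc` (typically truncated to a fixed number of lags per frame and per microphone pair) would be stacked alongside the spectrogram as an input feature, rather than reduced to the single delay value shown here.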

Keywords/Search Tags:

Deep learning algorithm, Convolutional recurrent network, Bidirectional simple recurrent unit, Generalized cross-correlation with phase transform, Principal component analysis, Multi-head attention mechanism

Related items

1	Research On Sensor Activity Recognition Based On Improved Deep Recurrent Neural Network
2	Research On Hierarchical Text Emotional Classification Based On Deep Learning
3	Research Of Deep Cross-network Recommendation Algorithm Based On Gated Recurrent Unit And Attention Mechanism
4	The Cross-site Script Detection Based On Deep Learning
5	Application Of Improved Deep Learning Algorithm In Chinese Text Classification
6	Research And Application Of Text-based Sentiment Analysis Technology
7	Research On Text Sentiment Analysis Based On CNN-RNN Deep Learning Model
8	Study On Speech Quality Assessment Based On Deep Learning
9	Question Classification Based On Deep Learning Model
10	Research On Single Channel Speech Enhancement Based On Multi-head Attention Mechanism