
Sound Event Recognition Based On Spectrogram Features

Posted on: 2021-03-04    Degree: Master    Type: Thesis
Country: China    Candidate: F Z Huang    Full Text: PDF
GTID: 2428330614953576    Subject: Electronic Science and Technology
Abstract/Summary:
With the continuous development of signal processing technology and computer hardware, applications of sound event recognition have received increasing attention in recent years. Sound event recognition can serve as an auxiliary component of situational awareness systems such as audio surveillance, robot navigation, and smart wearable devices. Research on feature extraction and classification models for sound event recognition has made great progress over the past decades, and the performance of the corresponding application systems has improved considerably. However, because of the structural characteristics of sound events and the fact that they often occur in noisy conditions, systems that recognize sound events well at low noise levels gradually lose robustness as the noise intensity increases, and their performance degrades markedly. Converting a sound signal into a spectrogram, which can be treated like an image, and applying image processing techniques to extract features for sound event recognition has become a new research trend. However, existing methods have not fully examined the differences between spectrograms and natural images, and extracting effective features from spectrograms remains a difficult research problem. Based on the close relationship between the spectrogram and the sound signal, this thesis studies sound event recognition based on spectrogram features. The main contributions are as follows:

1. A sound event recognition method based on texture features of the cochleagram is proposed. First, a Gammatone filter bank converts a sound event into a grey-scale cochleagram. Second, a Curvelet transform is applied to the cochleagram to obtain Curvelet subbands at different scales and in different directions. Then, ICLBP is used to extract the texture features of the Curvelet subbands and to generate block-wise statistical histograms; the concatenation of these histograms forms a new feature representation of the sound event. Experimental results show that, compared with other sound features, the proposed features achieve better recognition results for sound events in various noise environments.

2. A convolutional recurrent neural network with multi-sized convolution kernels is proposed for sound event recognition. First, the cochleagram is used as the input of the convolutional neural network, which handles the time-frequency variations of the spectrogram by learning filters that shift in time and frequency. To capture features of different extent on the feature maps, three convolution kernels of different sizes are applied to the last convolutional layer, which enriches the extracted features. Second, these features are fed into a recurrent neural network, which captures the temporal context of the sound events in the spectrogram by integrating a dynamically changing context window over the sequence history. Then, an attention mechanism is applied to the output of the recurrent neural network to make the sound features more discriminative. Finally, the output of the attention layer is passed to a fully connected layer, and a softmax classifier is used to classify the sound events. Experimental results show that the method obtains better recognition results for sound events under various noise conditions.
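The first contribution above describes a cochleagram-plus-texture-histogram pipeline. The following is a minimal Python sketch of that idea, under stated assumptions: it uses scipy's Gammatone filter design and a plain uniform LBP from scikit-image as a simplified stand-in for the ICLBP operator, and it omits the Curvelet decomposition (no standard Python package provides it). Function names, band counts, and block counts are illustrative, not the thesis configuration.

```python
# Sketch: Gammatone cochleagram + block-wise LBP texture histograms.
# Assumptions: scipy >= 1.6 (signal.gammatone), librosa, scikit-image.
import numpy as np
import librosa
from scipy import signal
from skimage.feature import local_binary_pattern

def cochleagram(wav, sr, n_bands=64, frame_len=1024, hop=512):
    """Pass the waveform through a Gammatone filter bank and frame-average
    each band's energy to form a grey-scale time-frequency image."""
    # Centre frequencies spaced on the ERB-rate scale between 50 Hz and ~0.9*Nyquist
    erb = lambda f: 21.4 * np.log10(4.37e-3 * f + 1.0)
    inv_erb = lambda e: (10 ** (e / 21.4) - 1.0) / 4.37e-3
    centres = inv_erb(np.linspace(erb(50.0), erb(0.9 * sr / 2), n_bands))

    bands = []
    for fc in centres:
        b, a = signal.gammatone(fc, 'iir', fs=sr)        # 4th-order IIR gammatone
        y = signal.lfilter(b, a, wav)
        # frame-wise RMS energy -> one row of the cochleagram
        frames = librosa.util.frame(y, frame_length=frame_len, hop_length=hop)
        bands.append(np.sqrt(np.mean(frames ** 2, axis=0) + 1e-12))
    cg = np.log(np.stack(bands) + 1e-12)                  # log-compress
    return (cg - cg.min()) / (cg.max() - cg.min() + 1e-12)  # scale to [0, 1]

def texture_histogram(cg, P=8, R=1, blocks=4):
    """Block-wise LBP histograms concatenated into one feature vector
    (a simplified stand-in for the ICLBP features used in the thesis)."""
    img = (cg * 255).astype(np.uint8)
    codes = local_binary_pattern(img, P, R, method='uniform')
    n_bins = P + 2                                        # uniform codes: 0 .. P+1
    feats = []
    for chunk in np.array_split(codes, blocks, axis=1):   # split along time
        h, _ = np.histogram(chunk, bins=n_bins, range=(0, n_bins), density=True)
        feats.append(h)
    return np.concatenate(feats)

# Example usage (hypothetical file name):
# wav, sr = librosa.load('siren.wav', sr=16000)
# feature = texture_histogram(cochleagram(wav, sr))
```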
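The second contribution is a convolutional recurrent network with parallel convolution kernels of different sizes and an attention layer over the recurrent outputs. Below is a minimal PyTorch sketch of that architecture; the layer widths, kernel sizes (3, 5, 7), number of classes, and pooling scheme are illustrative assumptions rather than the thesis configuration.

```python
# Sketch: CRNN with three parallel convolution kernels and temporal attention.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiKernelCRNN(nn.Module):
    def __init__(self, n_bands=64, n_classes=10, rnn_hidden=128):
        super().__init__()
        # Shared front-end: two conv blocks that pool only along frequency
        self.front = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.BatchNorm2d(32), nn.ReLU(),
            nn.MaxPool2d((2, 1)),
            nn.Conv2d(32, 64, 3, padding=1), nn.BatchNorm2d(64), nn.ReLU(),
            nn.MaxPool2d((2, 1)),
        )
        # Three parallel convolutions of different sizes on the last conv stage,
        # capturing patterns of different extent on the feature maps
        self.branches = nn.ModuleList([
            nn.Conv2d(64, 64, k, padding=k // 2) for k in (3, 5, 7)
        ])
        feat_dim = 3 * 64 * (n_bands // 4)     # channels x remaining frequency bins
        self.rnn = nn.GRU(feat_dim, rnn_hidden, batch_first=True, bidirectional=True)
        self.att = nn.Linear(2 * rnn_hidden, 1)  # attention score per time step
        self.fc = nn.Linear(2 * rnn_hidden, n_classes)

    def forward(self, x):                        # x: (batch, 1, n_bands, n_frames)
        h = self.front(x)
        h = torch.cat([F.relu(b(h)) for b in self.branches], dim=1)
        # flatten channels x frequency into one feature vector per time step
        b, c, f, t = h.shape
        h = h.permute(0, 3, 1, 2).reshape(b, t, c * f)
        h, _ = self.rnn(h)                       # (batch, time, 2*hidden)
        w = torch.softmax(self.att(h), dim=1)    # attention weights over time
        ctx = (w * h).sum(dim=1)                 # attention-weighted summary
        return F.log_softmax(self.fc(ctx), dim=-1)

# Example: a batch of 8 cochleagrams with 64 bands and 200 frames
# scores = MultiKernelCRNN()(torch.randn(8, 1, 64, 200))
```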
Keywords/Search Tags: sound event recognition, spectrogram, Curvelet transform, completed local binary pattern, convolutional recurrent neural network