
Research On Detection And Enhancement Of Abnormal Audio Event Based On ICRNN-GRU

Posted on: 2021-02-27
Degree: Master
Type: Thesis
Country: China
Candidate: C D Zhu
GTID: 2428330605950561
Subject: Information and Communication Engineering
Abstract/Summary:
As a major channel of information transmission, audio has the advantages of simple acquisition devices, convenient capture, small storage requirements, and high privacy. Audio monitoring therefore makes up for many shortcomings of video surveillance and has become an area of security monitoring. Audio event detection is the core technology of audio monitoring: it recognizes, from audio, the sudden abnormal events accompanied by abnormal sounds that occur in security monitoring. Traditional audio event detection methods mainly combine feature extraction with a classifier. The core problem remains feature extraction: classic audio features and hand-designed features are often too specific and incomplete, and these obvious defects ultimately lead to deviations in the modeling results. In recent years, deep learning has been proven effective in audio event detection and improves detection results. During detection, however, the surrounding background sounds are often complex and variable, and their presence significantly reduces detection performance.

In view of the shortcomings of traditional audio event detection methods, and after studying and analyzing deep learning methods, this paper proposes an abnormal audio event detection model that combines a convolutional neural network with a recurrent neural network. The neural network architecture pairs a CRC data enhancement module with a basic recognition module. The algorithm extracts the spectrogram of the abnormal audio as the feature input, obtains a denoised and enhanced spectrogram through the data enhancement module, and then obtains the final recognition result through the recognition model. Training the abnormal audio event detection model with deep learning requires a large amount of labeled audio data, but existing annotated audio data are scarce, so the abnormal audio event data set was self-made. The specific research contents are as follows:

(1) Self-made abnormal audio event data set

An accurate abnormal audio event detection model must be trained on a large amount of labeled abnormal audio, but abnormal sound data in audio monitoring are very scarce, so model training is severely limited. To solve this problem, the author collected and produced an abnormal sound event data set, manually labeling each audio clip. Noise is the natural enemy of audio monitoring, and abnormal sounds in real scenes are often affected by noise. The relationship between noise and the target signal is complex and varies from scene to scene. To make the model more robust and more applicable across a variety of public environments, a background sound data set was therefore also collected and created in addition to the abnormal sounds, covering background sounds from several common public environments. For the detection experiments, the abnormal sound data set and the background sound data set are mixed at different signal-to-noise ratios, giving an abnormal sound event data set with multiple signal-to-noise ratios under different background sounds.

(2) An improved abnormal audio event detection algorithm

Firstly, an improved convolutional neural network and a recurrent neural network are combined to form the basic CRNN model for abnormal audio event detection. The CRNN model can be regarded as an algorithm that relies on weak labels to predict strong labels. That is, for each independent abnormal audio clip, only the label of the entire clip is given, without a label for each frame (because not every frame in a labeled clip belongs to the target category). The CRNN algorithm can predict the label of each frame of the abnormal audio, or at least of every few frames, and finally gives the category label of the entire clip from all the frame-level labels. The main steps of the CRNN algorithm are as follows. First, the spectrogram of the abnormal audio is extracted as the feature input to the network model. Then, the convolutional layers automatically extract a feature sequence from each input spectrogram, and a recurrent network built on top of the convolutional network acoustically models that feature sequence to establish its internal sequential relationships. Finally, the Softmax classification function predicts a label for each frame, or every few frames, of the audio signal to obtain the final prediction result. On the abnormal audio detection task, this model performs better than a convolutional network alone.

Secondly, when ambient noise is high and the signal-to-noise ratio is low, the details of the extracted spectrogram are so blurred that much useful information is lost, which degrades recognition. The paper therefore proposes a deep learning-based data enhancement module to optimize the spectrogram. The module consists of a simple three-layer network, with convolutional, recurrent, and deconvolutional layers in sequence, forming a Convolutional-Recurrent-Deconvolutional Neural Network (CRDNN). It is an end-to-end data enhancement algorithm. The background noise needs to be analyzed and estimated, and the algorithm does not depend on the statistical distribution of the audio signal. The main steps of the CRDNN enhancement algorithm are: extracting the spectrogram of the abnormal sound; inputting it into the CRDNN network; and obtaining the enhanced spectrogram from the network output. The enhanced spectrogram is then input into the basic abnormal audio event detection model, so that the module is applied to the detection algorithm in this paper. Experimental results on the abnormal audio data set demonstrate the data enhancement effect of the CRDNN network and the effectiveness and generalization of the enhancement network: less residual background noise remains in the spectrogram, and recognition ability is improved.
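
The data set construction described in (1), combining abnormal sounds with background sounds at different signal-to-noise ratios, can be illustrated with a minimal sketch. This is not the thesis's actual procedure: the function names (`power`, `mix_at_snr`, `measured_snr_db`), the synthetic tone and noise signals, the 16 kHz sample rate, and the SNR values in the loop are all assumptions made for illustration. The sketch scales the background so that the ratio of event power to background power matches a target SNR in decibels, then adds the two signals.

```python
import math
import random


def power(samples):
    """Mean squared amplitude of a sequence of samples."""
    return sum(s * s for s in samples) / len(samples)


def mix_at_snr(event, background, snr_db):
    """Mix an event with a background at a target SNR (hypothetical helper).

    The background is scaled by a gain g chosen so that
        snr_db = 10 * log10(P_event / (g**2 * P_background)).
    Returns the mixture and the scaled background.
    """
    if len(event) != len(background):
        raise ValueError("event and background must have the same length")
    p_event = power(event)
    p_background = power(background)
    if p_event == 0 or p_background == 0:
        raise ValueError("event and background must both be non-silent")
    gain = math.sqrt(p_event / (p_background * 10 ** (snr_db / 10)))
    scaled = [gain * b for b in background]
    mixture = [e + b for e, b in zip(event, scaled)]
    return mixture, scaled


def measured_snr_db(event, scaled_background):
    """SNR in dB between the clean event and the scaled background."""
    return 10 * math.log10(power(event) / power(scaled_background))


if __name__ == "__main__":
    rng = random.Random(0)
    sample_rate = 16000  # assumed sample rate, not taken from the thesis
    # Stand-ins for real recordings: a 440 Hz tone and Gaussian noise.
    event = [math.sin(2 * math.pi * 440 * n / sample_rate)
             for n in range(sample_rate)]
    background = [rng.gauss(0.0, 1.0) for _ in range(sample_rate)]
    for snr_db in (-5, 0, 5, 10):  # illustrative SNR levels only
        mixture, scaled = mix_at_snr(event, background, snr_db)
        print(snr_db, round(measured_snr_db(event, scaled), 3))
```

Scaling the background rather than the event keeps the level of the abnormal sound unchanged across mixtures, so only the noise level varies between SNR conditions. Scaling the event instead would be an equally valid choice.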
Keywords/Search Tags: abnormal audio, audio event detection, deep learning, spectrogram, data enhancement