
Research On Sound Event Detection System Based On Neural Network

Posted on: 2022-12-26
Degree: Master
Type: Thesis
Country: China
Candidate: Y J Wang
GTID: 2518306764979249
Subject: Automation Technology
Abstract/Summary:
Sound event detection (SED) is a technology that uses the characteristics of sound signals to predict the types of sound events. It has broad application prospects in smart homes, public security, and other fields. Traditional sound event detection is generally based on the GMM-HMM model, whose recognition accuracy is low and whose encoding and decoding are computationally expensive, so it is difficult to apply in practice. In recent years, researchers in China and abroad have proposed detection methods based on neural networks (NN), which significantly improve recognition accuracy over traditional machine learning methods. However, a major problem with NN-based SED algorithms is that they usually involve a large number of parameters and floating-point operations (FLOPs), resulting in high processing delay and hardware overhead. This makes NN-based methods generally difficult to deploy on IoT devices that require low latency and a small storage footprint. Building a sound event detection algorithm with low network complexity and high recognition accuracy is therefore the focus of this thesis. The thesis designs a lightweight sound event detection algorithm with low complexity and high accuracy, and implements a sound event detection system based on FPGA-DPU.

The main work of this thesis is as follows. First, because current sound event detection algorithms have high parameter counts and FLOPs, a selective separable convolution mechanism is used. This mechanism effectively reduces the parameters and FLOPs of the algorithm while achieving high recognition accuracy. Second, to improve the recognition accuracy of the sound event detection algorithm while keeping its complexity low, a coordinated attention mechanism is used. This mechanism adds almost no complexity to the algorithm and acts on the channel, time, and frequency domains simultaneously, so that the detection algorithm focuses on the features and regions relevant to sound event detection and pays less attention to regions that have little impact on the detection task. Third, the lightweight sound event detection algorithm is implemented on the deep learning processing unit (DPU) of an FPGA, and a sound event detection system based on FPGA-DPU is constructed. The system is developed on the ZCU104 platform, and the DPU deployment is completed using the Vivado 2020, DNNDK, and PetaLinux development platforms. Finally, the algorithm and system are tested and analyzed on commonly used sound event detection data sets (ESC-50, ESC-10, and UrbanSound8K).

The lightweight sound event detection algorithm designed in this thesis has only 0.246 M parameters and 203 M FLOPs, and achieves an accuracy of 87.3% on the ESC-50 data set. In the sound event detection system based on FPGA-DPU, the average recognition time for a single audio clip is 8.24 ms on the ESC-50 and ESC-10 data sets and 6.6 ms on UrbanSound8K, which meets the real-time requirements.
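The abstract does not specify the layer structure of the selective separable convolution mechanism, so the following is only a minimal sketch of why separable convolutions in general reduce parameters and multiply-accumulate operations. It assumes a plain depthwise separable convolution (a k x k depthwise filter per input channel followed by a 1 x 1 pointwise convolution); the function names and the layer sizes (64 input channels, 128 output channels, 3 x 3 kernel, 32 x 32 output map) are illustrative assumptions, not values from the thesis.

```python
def conv2d_params(c_in, c_out, k):
    """Weight count of a standard k x k convolution (bias omitted)."""
    return c_in * c_out * k * k


def separable_conv2d_params(c_in, c_out, k):
    """Weight count of a depthwise separable convolution (bias omitted).

    Assumed structure: one k x k depthwise filter per input channel,
    then a 1 x 1 pointwise convolution that mixes channels.
    """
    depthwise = c_in * k * k
    pointwise = c_in * c_out
    return depthwise + pointwise


def macs(params, h_out, w_out):
    """Multiply-accumulates: every weight is applied once per output position."""
    return params * h_out * w_out


# Illustrative layer sizes (assumptions, not taken from the thesis).
c_in, c_out, k = 64, 128, 3
h_out, w_out = 32, 32

standard = conv2d_params(c_in, c_out, k)
separable = separable_conv2d_params(c_in, c_out, k)

print("standard params :", standard)
print("separable params:", separable)
print("param ratio     :", separable / standard)
print("standard MACs   :", macs(standard, h_out, w_out))
print("separable MACs  :", macs(separable, h_out, w_out))
```

For this structure the ratio of separable to standard cost is 1/c_out + 1/k^2, so with a 3 x 3 kernel the saving approaches a factor of nine as the number of output channels grows. The same ratio applies to the multiply-accumulate count, because both layer types apply each weight once per output position.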
Keywords/Search Tags: Sound event detection, neural network, low complexity