The Research And FPGA Implementation Of Sound Feature Extraction And Recognition Algorithms

Posted on: 2022-01-10
Degree: Master
Type: Thesis
Country: China
Candidate: J Chai
GTID: 2518306524475484
Subject: Communication and Information System
Abstract/Summary:
Sound signals are everywhere in people's lives. As one of the most common kinds of signal, sound is a source of information that lets humans build knowledge of the outside world and express their own thoughts. Sound recognition is now widely used in fields such as nature conservation, navigation, and security monitoring. Sound recognition technology has two main research directions: sound feature extraction algorithms and recognition algorithms. Existing algorithms for both are highly complex and computationally expensive, so they are usually implemented on software platforms and struggle to meet the low-power and real-time requirements of edge computing devices. To address these problems, this thesis chooses the Field Programmable Gate Array (FPGA) as the implementation platform and studies how to simplify and improve sound feature extraction and recognition algorithms and how to implement them on an FPGA.

The thesis carries out the following research:

1) It surveys existing sound feature extraction and sound recognition algorithms. The mel frequency spectral coefficients (MFSC) extraction algorithm and the convolutional neural network (CNN) recognition algorithm are selected because they perform well and are suitable for hardware implementation.

2) Starting from open sound data sets, specific sound types are selected according to the application requirements. Training and test data sets are built that cover 6 kinds of sound and 5900 sound samples in total, to increase the diversity of the sound sources and keep the different kinds of sound evenly represented.

3) The original algorithms are improved to reduce resource consumption on the FPGA platform. The original MFSC extraction requires a long sound input, and the original CNN has a large network structure. The thesis shortens the sound input and reduces the number of network parameters by changing the network structure. After fixed-point simulation, the accuracy of the improved algorithm differs from that of the original algorithm by 0.02%.

4) The improved MFSC extraction and CNN recognition algorithms are implemented on the FPGA platform. The MFSC feature extraction part extracts only the effective data of the mel filter bank, which reduces the amount of data stored. The CNN reuses modules to reduce resource usage.

Compared with the original sound feature extraction and recognition algorithms, the sound input time is reduced to 1.61 s, which is 83.9% lower. The MFSC feature map data is reduced by 83.94%, which improves real-time performance and reduces the parameters and computation. The number of CNN parameters is reduced by 65.63%. With the sound feature extraction and CNN recognition algorithms implemented on the FPGA platform, the average accuracy of sound recognition in actual tests is 88.33% (on the self-built sound data set, the average recognition accuracy is 90.9%). The sound recognition system realized in this thesis is applied to an early warning system for the intelligent Internet of Things, as a terminal device that monitors and recognizes environmental sounds.
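
The idea in item 4) of keeping only the effective data of the mel filter bank can be illustrated with a small software sketch. It is not the thesis's FPGA design: the function names, the 16 kHz sample rate, the 512-point FFT, the 40 filters, and the common 2595·log10(1 + f/700) mel formula are all assumptions chosen for illustration. Each triangular filter is stored as a start bin plus its nonzero weights only, and the MFSC of one frame is the log of each filter's weighted energy.

```python
import math

def hz_to_mel(f):
    # Common mel-scale formula (an assumption; the abstract does not specify one).
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def sparse_mel_filters(n_filters, n_fft, sample_rate):
    """Build triangular mel filters, keeping only the nonzero weights.

    Returns a list of (start_bin, weights) pairs, one per filter.
    """
    n_bins = n_fft // 2 + 1
    mel_max = hz_to_mel(sample_rate / 2.0)
    # n_filters + 2 edge frequencies, equally spaced on the mel scale.
    edges = [mel_to_hz(mel_max * i / (n_filters + 1))
             for i in range(n_filters + 2)]
    filters = []
    for k in range(1, n_filters + 1):
        lo, mid, hi = edges[k - 1], edges[k], edges[k + 1]
        start, weights = None, []
        for b in range(n_bins):
            f = b * sample_rate / n_fft
            if lo < f <= mid:
                w = (f - lo) / (mid - lo)   # rising slope
            elif mid < f < hi:
                w = (hi - f) / (hi - mid)   # falling slope
            else:
                continue                    # zero weight: not stored
            if start is None:
                start = b
            weights.append(w)
        filters.append((start if start is not None else 0, weights))
    return filters

def mfsc_frame(power_spectrum, filters, eps=1e-10):
    """Log mel filter bank energies (MFSC) for one frame's power spectrum."""
    feats = []
    for start, weights in filters:
        energy = sum(w * power_spectrum[start + i]
                     for i, w in enumerate(weights))
        feats.append(math.log(energy + eps))
    return feats

# Hypothetical parameters, for illustration only.
filters = sparse_mel_filters(n_filters=40, n_fft=512, sample_rate=16000)
stored = sum(len(w) for _, w in filters)
dense = 40 * (512 // 2 + 1)
print(f"stored weights: {stored} of {dense} in a dense filter matrix")
```

Because neighbouring triangles overlap only with their immediate neighbours, the number of stored weights grows with the number of frequency bins rather than with bins times filters, which is the kind of saving the abstract attributes to storing effective data only.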
Keywords/Search Tags: sound recognition, MFSC, CNN, FPGA