
Implementation Of Embedded System For Speech Keyword Spotting Based On Neural Network

Posted on: 2022-03-05
Degree: Master
Type: Thesis
Country: China
Candidate: Y T Pu
Full Text: PDF
GTID: 2518306602494584
Subject: Master of Engineering
Abstract/Summary:
Voice interaction is an important channel of communication between human users and machines. With the wide adoption of home IoT devices, mobile devices, and other low-power IoT hardware, using keyword spotting (KWS) to make human-computer interaction reliable has become a research hotspot. In recent years, with the continued development of deep learning, techniques based on Convolutional Neural Networks (CNNs) have performed very well in KWS. Because CNNs extract local features effectively and have good spatial invariance, they can recognize speech keywords efficiently and accurately. However, CNNs have many parameters and convolution is computationally expensive, which makes them hard to deploy on embedded devices with limited memory and computing resources.

To reduce the computing and storage cost of deploying a CNN on embedded devices, this thesis studies compression and acceleration algorithms for CNN models, based on an analysis of the CNN parameter storage structure and the convolution computation model. For model compression, neural network quantization is used: different data bit widths are assigned according to how sensitive the network parameters are to bit width. In addition, a progressive quantization strategy optimizes the quantization process, and repeated retraining compensates for the precision lost to quantization. For model acceleration, network pruning is used with an improved channel selection strategy. The strategy jointly considers the convolution kernels before and after the layer being pruned, minimizes a feature error metric, and uses a greedy algorithm to identify the redundant channels of the convolution kernels.

Finally, a keyword spotting system built on the proposed compression and acceleration algorithms is deployed on the ESP32, a low-cost, low-power embedded module, and tested on real hardware. Experimental results show that the progressive quantization strategy keeps the loss of precision within 5%, and that the model precision of the proposed algorithm is about 10% higher than that of similar algorithms. For model acceleration, the proposed algorithm is about 20% better than the traditional filter pruning algorithm applied to classical networks. With compression and acceleration, a CNN of the same structure needs about 54 times less memory and about 3.3 times less computation, with an overall accuracy loss of no more than 10%. In the experiment on the ESP32, the speech keyword spotting system completes a keyword recognition in 400 ms on average.
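
The idea of assigning a bit width according to parameter sensitivity can be illustrated with a minimal sketch. It is not the thesis's algorithm: the function names (`quantize`, `choose_bit_width`), the candidate bit widths, the mean-squared-error sensitivity measure, and the tolerance are all assumptions made for illustration, and the retraining step of progressive quantization is not shown.

```python
from typing import List, Sequence


def quantize(weights: Sequence[float], bits: int) -> List[float]:
    """Symmetric uniform quantization (assumed scheme, bits >= 2).

    Each weight is snapped to the nearest of the signed integer levels
    representable in `bits` bits and then rescaled back to a float.
    """
    max_abs = max(abs(w) for w in weights)
    if max_abs == 0.0:
        return [0.0 for _ in weights]
    levels = 2 ** (bits - 1) - 1      # e.g. 127 for 8 bits
    scale = max_abs / levels          # float value of one integer step
    return [round(w / scale) * scale for w in weights]


def mse(a: Sequence[float], b: Sequence[float]) -> float:
    """Mean squared error, used here as a simple sensitivity measure."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def choose_bit_width(
    weights: Sequence[float],
    candidates: Sequence[int] = (2, 4, 8),   # hypothetical choices
    tolerance: float = 1e-3,                 # hypothetical threshold
) -> int:
    """Return the smallest candidate bit width whose quantization error
    stays within `tolerance`; fall back to the largest candidate."""
    for bits in sorted(candidates):
        if mse(weights, quantize(weights, bits)) <= tolerance:
            return bits
    return max(candidates)


if __name__ == "__main__":
    # Toy weights for two hypothetical layers.
    layers = {
        "conv1": [0.9, -0.05, 0.4, -0.7],
        "fc": [1.0, -1.0, 0.0, 1.0],
    }
    for name, w in layers.items():
        print(name, choose_bit_width(w))
```

In this sketch, a layer whose weights survive coarse quantization with little error receives a narrow bit width, while a more sensitive layer keeps a wider one. A real deployment would measure sensitivity through the change in model accuracy rather than raw weight error.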
Keywords/Search Tags: Keyword recognition, Embedded system, CNN, Network quantization, Network pruning