
Design Of Energy-Efficient RNN Accelerator Based On Network Compression And Voltage-Precision Scaling

Posted on: 2019-11-11    Degree: Master    Type: Thesis
Country: China    Candidate: T T Xu    Full Text: PDF
GTID: 2428330596460773    Subject: Microelectronics and Solid State Electronics
Abstract/Summary:
As research on artificial neural networks advances, deep learning has brought convenience to many aspects of modern society. Systems built on artificial neural networks are nearly everywhere in modern life: they identify objects in images, convert speech to text, and match news items, posts, or products with user interests, making daily life more intelligent. In particular, systems based on recurrent neural networks (RNNs) have an advantage over other neural networks when processing sequential signals such as speech, because they propagate information in both space and time. However, RNN models keep growing in pursuit of higher prediction accuracy; such large models are both computation- and memory-intensive, and deploying them incurs extremely high power consumption, severely limiting the energy efficiency of RNN accelerators.

This thesis takes the RNN algorithm as its research basis and a high-energy-efficiency RNN hardware accelerator as its design target, improving accelerator energy efficiency from two aspects: algorithm scheduling and hardware architecture. First, the computation flow and key operations of the RNN algorithm are analyzed. Second, a precision-adaptive network-parameter compression method based on pruning and hybrid quantization is implemented to meet the requirements of different networks and application scenarios (sketched below); the corresponding parameter storage and scheduling scheme not only reduces the amount of computation and the storage overhead, but also improves the data reuse rate. Third, because multiplication consumes a large share of the energy, a precision-adaptive approximate multiplier and a computing array are designed. Moreover, a voltage-precision scaling method for the mixed-precision system is proposed, together with a novel reference-voltage generation circuit: when the operand bit width, and hence the delay of the computing unit, changes, an appropriate reference voltage is supplied to minimize wasted timing margin, thereby saving power and achieving high energy efficiency.

Experimental results show that, in a TSMC 45 nm process at 200 MHz, the proposed accelerator supports 4-16 bit computation for different RNN models. With 16-bit inputs the circuit operates at 1.1 V, delivering a peak performance of 102.4 GOPs at 166.8 mW, for an energy efficiency of 0.6 TOPs/W; with 4-bit inputs it operates at 0.8 V with unchanged peak performance, consuming only 38.4 mW, for 2.7 TOPs/W (checked below). The energy efficiency of the proposed accelerator is thus more than 2.5 times that of other RNN accelerators.
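The abstract does not give implementation details of the pruning and hybrid-quantization compression; the following Python sketch shows one common form such a precision-adaptive pass can take, assuming simple magnitude pruning followed by uniform symmetric quantization. The function name, sparsity level, and bit widths are illustrative and not taken from the thesis.

```python
import numpy as np

def prune_and_quantize(weights, sparsity=0.9, bits=8):
    """Minimal compression sketch: magnitude pruning, then uniform
    symmetric k-bit quantization. Sparsity and bit width would be
    chosen per network and application scenario."""
    # Magnitude pruning: zero out the smallest |w| until the target
    # sparsity is reached.
    threshold = np.quantile(np.abs(weights).ravel(), sparsity)
    mask = np.abs(weights) > threshold
    pruned = weights * mask

    # Uniform symmetric quantization of the surviving weights to
    # `bits` bits (the accelerator's range is 4-16 bit).
    scale = np.max(np.abs(pruned)) / (2 ** (bits - 1) - 1)
    if scale == 0:
        return pruned.astype(np.int32), mask, 1.0
    q = np.clip(np.round(pruned / scale),
                -(2 ** (bits - 1)), 2 ** (bits - 1) - 1)
    return q.astype(np.int32), mask, scale

# Example: a 4-bit configuration of a recurrent weight matrix.
W = np.random.randn(256, 256).astype(np.float32)
W_q, mask, scale = prune_and_quantize(W, sparsity=0.9, bits=4)
W_hat = W_q * scale   # dequantized approximation used at inference
```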
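Voltage-precision scaling, as described above, matches the reference voltage to the operand bit width so that the slower high-precision mode gets a higher supply while the faster low-precision mode can run lower. A minimal selection sketch follows; only the 4-bit/0.8 V and 16-bit/1.1 V operating points are reported in the abstract, so the table entries beyond those, and the helper itself, are assumptions for illustration.

```python
# Hypothetical precision-to-voltage selection for the mixed-precision
# computing array. Only the 4-bit (0.8 V) and 16-bit (1.1 V) points
# come from the reported results; other widths are left undefined
# rather than guessed.
REFERENCE_VOLTAGE_V = {
    4: 0.8,    # reported: 38.4 mW, 2.7 TOPs/W
    16: 1.1,   # reported: 166.8 mW, 0.6 TOPs/W
}

def select_reference_voltage(bit_width: int) -> float:
    """Return the reference voltage for a given operand bit width."""
    try:
        return REFERENCE_VOLTAGE_V[bit_width]
    except KeyError:
        raise ValueError(
            f"no reported reference voltage for {bit_width}-bit operands")
```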
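The reported efficiency figures can be reproduced directly from peak performance and power (GOPs divided by mW gives TOPs/W); the short check below uses only numbers quoted in the abstract.

```python
# Energy efficiency = peak throughput / power.
# 1 GOPs / 1 mW = 1e9 op/s / 1e-3 W = 1e12 op/s per W = 1 TOPs/W.
peak_gops = 102.4                      # same peak for both configurations

eff_16bit = peak_gops / 166.8          # 16-bit inputs, 1.1 V
eff_4bit = peak_gops / 38.4            # 4-bit inputs, 0.8 V

print(f"16-bit: {eff_16bit:.2f} TOPs/W")   # ~0.61, reported as 0.6
print(f" 4-bit: {eff_4bit:.2f} TOPs/W")    # ~2.67, reported as 2.7
```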
Keywords/Search Tags:Recurrent Neural Networks, Accelerator, Network Compression, Approximate Computing, Voltage-Precision Scaling