
Optimization Methods Of Stochastic Computing Systems And Its Applications On Deep Learning

Posted on: 2021-02-04
Degree: Master
Type: Thesis
Country: China
Candidate: L Zhang
GTID: 2518306503474644
Subject: IC Engineering
Abstract/Summary:
Complex network structures in deep learning require large-scale computing resources. In mobile devices and embedded systems, where area and power budgets are limited, this restricts the size of neural networks. Stochastic computing, an unconventional computing paradigm, offers low hardware cost, fast computation, and high fault tolerance. Unlike conventional binary arithmetic, stochastic computing operates on stochastic bit streams, representing a target value by the probability of a 1 appearing in the stream. This encoding allows important arithmetic operations such as multiplication and addition to be implemented with simple logic operations. In recent years, stochastic computing has therefore seen increasing research and application in the field of deep learning. However, stochastic computing still faces challenges. (1) The total delay of generating stochastic numbers equals the product of the circuit delay and the length of the stochastic bit stream; as computational precision increases, the bit-stream length grows exponentially, so stochastic number generation suffers from high delay under high-precision computation. (2) In deep learning networks based on stochastic computing, random fluctuations in the bit stream degrade the accuracy of matrix multiplication and activation-function results, and nonlinear activation functions can only be realized by approximate estimation in the circuit.

This thesis proposes a parallel architecture to reduce the delay of generating stochastic numbers. The area overhead of the parallel circuits is reduced by sharing the random number source and parts of the weighted binary sequence generator: the same random number source supplies random numbers to different inputs, and circuits with identical inputs and outputs within the weighted binary sequence generator are merged. Experimental results show that this work improves the accuracy of multiplication operations and reduces the area-delay product by 58.5%.

This thesis also uses an approximate parallel counter as a de-randomizer to convert stochastic bit streams back into binary number streams; this parallel counter reduces the hardware cost of the circuit. In addition, the thesis adopts a reconfigurable architecture based on stochastic logic (ReSC) with Bernstein polynomial approximation to implement the activation function. Experimental results show that, compared with neural networks based on finite state machines, the resulting network achieves higher accuracy, closer to that of a conventional floating-point neural network. After optimizing the stochastic number generator circuits, the area-delay product of the network circuit is reduced by 67.9%.
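To make the encoding concrete, the following is a minimal software sketch (not the thesis's hardware design) of unipolar stochastic computing: a value in [0, 1] is encoded as the probability of a 1 in a bit stream, multiplication becomes a bitwise AND of two independent streams, and de-randomization estimates the value by counting 1s. The stream length, seed, and helper names are illustrative assumptions.

```python
import random

def to_stream(p, length, rng):
    """Encode probability p in [0, 1] as a unipolar stochastic bit
    stream: each bit is 1 with probability p, independently."""
    return [1 if rng.random() < p else 0 for _ in range(length)]

def from_stream(bits):
    """De-randomize: estimate the encoded value as the fraction of 1s
    (what a parallel counter computes in hardware)."""
    return sum(bits) / len(bits)

rng = random.Random(42)
n = 4096  # bit-stream length; accuracy improves as n grows
a, b = 0.6, 0.5
sa = to_stream(a, n, rng)
sb = to_stream(b, n, rng)

# Multiplication of independent streams is a bitwise AND:
# P(x AND y = 1) = P(x = 1) * P(y = 1)
prod = [x & y for x, y in zip(sa, sb)]
print(from_stream(prod))  # close to a * b = 0.30, up to random fluctuation
```

The random fluctuation of the estimate shrinks roughly as 1/sqrt(n), which is why high precision demands exponentially longer streams, the delay problem the parallel architecture in this thesis targets.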
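The ReSC approach evaluates activation functions via Bernstein polynomials, which stochastic logic can realize with multiplexers. As a numerical sketch of the approximation itself (the example activation, degree, and function names are assumptions for illustration, not the thesis's exact configuration):

```python
import math
from math import comb

def bernstein_approx(f, n):
    """Return the degree-n Bernstein polynomial B_n[f](x) approximating
    f on [0, 1]: B_n[f](x) = sum_k f(k/n) * C(n,k) * x^k * (1-x)^(n-k).
    ReSC evaluates such polynomials with stochastic logic, with the
    coefficients f(k/n) encoded as constant stochastic streams."""
    coeffs = [f(k / n) for k in range(n + 1)]
    def B(x):
        return sum(c * comb(n, k) * x**k * (1 - x)**(n - k)
                   for k, c in enumerate(coeffs))
    return B

# Hypothetical example: a sigmoid rescaled so its input and output
# both lie in [0, 1], the range a unipolar stream can represent.
sigmoid = lambda x: 1 / (1 + math.exp(-(4 * x - 2)))
approx = bernstein_approx(sigmoid, 6)

# Maximum error of the degree-6 approximation over a grid on [0, 1].
err = max(abs(approx(x / 100) - sigmoid(x / 100)) for x in range(101))
print(err)
```

Because the Bernstein coefficients all lie in [0, 1] for such functions, each coefficient maps directly to a stochastic stream, which is what makes this polynomial basis a natural fit for stochastic circuits.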
Keywords/Search Tags:Stochastic computing, Parallel implementation, Area-delay product, ReSC