
Quantization And Acceleration Of Deep Network Parameters

Posted on: 2021-10-15    Degree: Master    Type: Thesis
Country: China    Candidate: Q Y Liu    Full Text: PDF
GTID: 2518306017474734    Subject: Computer technology
Abstract/Summary:
Since AlexNet, the first truly deep neural network, won the ILSVRC 2012 competition by a large margin in 2012, deep learning has attracted growing attention from scholars and companies. The concept of deep learning was proposed by Hinton in 2006 and has developed rapidly since then. It now achieves excellent performance on many problems, such as computer vision, and large companies at home and abroad, including Google, Microsoft, and Alibaba, use it on a large scale. However, deep learning places heavy demands on hardware: it requires large amounts of storage and computing resources, which makes it difficult to deploy on resource-constrained devices such as mobile devices. Accelerating and compressing deep networks to reduce their consumption of hardware resources and time is therefore an important research direction in both academia and industry. On this basis, this thesis proposes the following methods for deep network compression and acceleration:

1. To address the accuracy loss caused by discrete quantization levels being difficult to train in quantization-based compression and acceleration, this thesis improves the mathematical model of the soft quantization function. A sigmoid function is used to approximate the step function, which reduces the complexity of the function model and allows the original full-precision network to approach the quantized network arbitrarily closely during training, making the quantized network trainable and improving its accuracy. Experimental results show that this method achieves accuracy comparable to state-of-the-art quantized networks. (A sketch of such a soft quantizer is given after this list.)

2. This thesis then addresses the long training time needed for the original network to approach the quantized network. Minimizing the loss function suffers from high-variance oscillations, which makes training slow. Stochastic gradient descent is accelerated by strengthening updates along the relevant direction and damping oscillations along irrelevant directions: when the gradient of the loss function is computed, an additional update vector component is added to speed up convergence. During training, the network converges faster and the probability that the loss function fails to converge is reduced. (A sketch of such a momentum-style update follows the quantizer example below.)

3. Finally, this thesis optimizes the forward-propagation time of the quantized neural network. The convolution operation is reorganized so that bit operations replace matrix multiplications, which shortens the network's forward-propagation time. Experimental results demonstrate the effectiveness of the proposed method. (A bitwise dot-product sketch is given below.)
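The abstract does not give the exact form of the improved soft quantization function, so the following is only a minimal sketch of the general idea under stated assumptions: a staircase quantizer whose hard steps are replaced by shifted sigmoids with a sharpness (temperature) parameter, so the mapping stays differentiable during training and approaches the hard quantizer as the sigmoids sharpen. The function names, the ternary level set, and the temperature values are illustrative assumptions, not the thesis's actual formulation.

```python
import numpy as np

def soft_quantize(w, levels, temperature):
    """Differentiable approximation of a staircase quantizer (illustrative sketch).

    Each hard step of the quantizer is replaced by a shifted sigmoid; as
    `temperature` grows, the sigmoids sharpen and the soft quantizer
    approaches the hard one, so the full-precision network can be trained
    toward its quantized counterpart with ordinary gradient descent.
    """
    levels = np.asarray(levels, dtype=np.float64)      # sorted quantization levels
    thresholds = (levels[:-1] + levels[1:]) / 2.0       # midpoints between levels
    steps = np.diff(levels)                             # height of each step
    w = np.asarray(w, dtype=np.float64)[..., None]
    # Sum of sigmoids: start at the lowest level, add one smoothed step
    # of the appropriate height at every threshold.
    sigmoids = 1.0 / (1.0 + np.exp(-temperature * (w - thresholds)))
    return levels[0] + np.sum(steps * sigmoids, axis=-1)

def hard_quantize(w, levels):
    """Reference hard quantizer: snap each weight to the nearest level."""
    levels = np.asarray(levels, dtype=np.float64)
    idx = np.abs(np.asarray(w)[..., None] - levels).argmin(axis=-1)
    return levels[idx]

if __name__ == "__main__":
    levels = [-1.0, 0.0, 1.0]                 # e.g. ternary quantization (assumed)
    w = np.linspace(-2.0, 2.0, 9)
    for t in (1.0, 10.0, 100.0):              # larger t -> closer to the hard quantizer
        print(t, np.round(soft_quantize(w, levels, t), 3))
    print("hard", hard_quantize(w, levels))
```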
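The added update vector component is described only qualitatively in the abstract (strengthen the consistent direction, damp the oscillating one), which matches a momentum-style velocity term for stochastic gradient descent. The sketch below shows such a generic momentum update applied to a toy ill-conditioned quadratic; the coefficient gamma, the learning rate, and the toy problem itself are assumptions chosen for illustration, not the thesis's exact scheme.

```python
import numpy as np

def sgd_momentum_step(params, grads, velocity, lr=0.01, gamma=0.9):
    """One SGD update with a momentum (velocity) term.

    The velocity accumulates past gradients, reinforcing update components
    that point in a consistent direction and damping components that
    oscillate, which speeds up and stabilizes convergence of the loss.
    """
    for key in params:
        velocity[key] = gamma * velocity[key] + lr * grads[key]
        params[key] -= velocity[key]
    return params, velocity

if __name__ == "__main__":
    # Toy ill-conditioned quadratic 0.5 * w^T A w: plain gradient descent
    # oscillates along the steep axis, the momentum term damps that oscillation.
    A = np.diag([1.0, 50.0])
    params = {"w": np.array([5.0, 5.0])}
    velocity = {"w": np.zeros(2)}
    for step in range(200):
        grads = {"w": A @ params["w"]}        # gradient of 0.5 * w^T A w
        params, velocity = sgd_momentum_step(params, grads, velocity,
                                             lr=0.02, gamma=0.9)
    print("final w:", np.round(params["w"], 4))   # converges toward the minimum at 0
```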
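For a quantized network whose weights and activations are constrained to {-1, +1}, a dot product can be computed with bitwise XOR and a popcount instead of multiply-accumulate operations: encoding +1 as bit 1 and -1 as bit 0, agreeing positions contribute +1 and disagreeing ones -1, so dot = n - 2 * popcount(a XOR b). The minimal sketch below illustrates this replacement on packed Python integers; the packing scheme and function names are assumptions, and a real implementation would apply the same identity to machine words inside the convolution's inner loop, which is where the forward-propagation speed-up comes from.

```python
import numpy as np

def pack_signs(x):
    """Encode a vector of +/-1 values as an integer bitmask (+1 -> bit 1, -1 -> bit 0)."""
    bits = 0
    for i, v in enumerate(x):
        if v > 0:
            bits |= 1 << i
    return bits

def binary_dot(bits_a, bits_b, n):
    """Dot product of two +/-1 vectors of length n from their bitmasks.

    XOR marks the positions where the vectors disagree, so
    dot = n - 2 * popcount(a XOR b); one XOR plus one popcount replaces
    n multiplications and additions.
    """
    return n - 2 * bin(bits_a ^ bits_b).count("1")

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    a = rng.choice([-1, 1], size=64)
    b = rng.choice([-1, 1], size=64)
    # The bitwise result matches the ordinary floating/integer dot product.
    assert binary_dot(pack_signs(a), pack_signs(b), a.size) == int(a @ b)
    print("bitwise dot:", binary_dot(pack_signs(a), pack_signs(b), a.size))
```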
Keywords/Search Tags: quantized compression, accelerated training, accelerated forward propagation