
Quantization And Acceleration Of Deep Network Parameters

Posted on: 2021-10-15    Degree: Master    Type: Thesis
Country: China    Candidate: Q Y Liu    Full Text: PDF
GTID: 2518306017474734    Subject: Computer technology
Abstract/Summary:
Since AlexNet, the first truly deep neural network, won the ILSVRC 2012 competition by a large margin in 2012, deep learning has attracted growing attention from scholars and companies. The concept of deep learning was proposed by Hinton in 2006 and has developed rapidly since then. It now achieves excellent performance on many problems, such as computer vision, and large companies at home and abroad, including Google, Microsoft, and Alibaba, use it on a large scale. However, deep learning places heavy demands on hardware: it requires large amounts of storage and computing resources, which makes it difficult to deploy on resource-constrained devices such as mobile devices. Accelerating and compressing deep networks to reduce their consumption of hardware resources and time is therefore an important research direction in both academia and industry. On this basis, this thesis proposes the following methods for deep network compression and acceleration:

1. To address the accuracy loss caused by discrete quantization levels being difficult to train in quantization-based compression and acceleration, this thesis improves the mathematical model of the soft quantization function. A sigmoid function is used to approximate the step function, which reduces the complexity of the function model and allows the original full-precision network to approach the quantized network arbitrarily closely during training, making the quantized network trainable and improving its accuracy. Experimental results show that this method achieves accuracy comparable to state-of-the-art quantized networks. (A sketch of such a soft quantizer is given after this list.)

2. This thesis then addresses the long training time needed for the original network to approach the quantized network. Minimizing the loss function suffers from high-variance oscillations, which makes training slow. Stochastic gradient descent is accelerated by strengthening updates along the relevant direction and damping oscillations along irrelevant directions: when the gradient of the loss function is computed, an additional update vector component is added to speed up convergence. During training, the network converges faster and the probability that the loss function fails to converge is reduced. (A sketch of such a momentum-style update follows the quantizer example below.)

3. Finally, this thesis optimizes the forward-propagation time of the quantized neural network. The convolution operation is reorganized so that bit operations replace matrix multiplications, which shortens the network's forward-propagation time. Experimental results demonstrate the effectiveness of the proposed method. (A bitwise dot-product sketch is given below.)
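The abstract does not give the exact form of the improved soft quantization function, so the following is only a minimal sketch of the general idea under stated assumptions: a staircase quantizer whose hard steps are replaced by shifted sigmoids with a sharpness (temperature) parameter, so the mapping stays differentiable during training and approaches the hard quantizer as the sigmoids sharpen. The function names, the ternary level set, and the temperature values are illustrative assumptions, not the thesis's actual formulation.

```python
import numpy as np

def soft_quantize(w, levels, temperature):
    """Differentiable approximation of a staircase quantizer (illustrative sketch).

    Each hard step of the quantizer is replaced by a shifted sigmoid; as
    `temperature` grows, the sigmoids sharpen and the soft quantizer
    approaches the hard one, so the full-precision network can be trained
    toward its quantized counterpart with ordinary gradient descent.
    """
    levels = np.asarray(levels, dtype=np.float64)      # sorted quantization levels
    thresholds = (levels[:-1] + levels[1:]) / 2.0       # midpoints between levels
    steps = np.diff(levels)                             # height of each step
    w = np.asarray(w, dtype=np.float64)[..., None]
    # Sum of sigmoids: start at the lowest level, add one smoothed step
    # of the appropriate height at every threshold.
    sigmoids = 1.0 / (1.0 + np.exp(-temperature * (w - thresholds)))
    return levels[0] + np.sum(steps * sigmoids, axis=-1)

def hard_quantize(w, levels):
    """Reference hard quantizer: snap each weight to the nearest level."""
    levels = np.asarray(levels, dtype=np.float64)
    idx = np.abs(np.asarray(w)[..., None] - levels).argmin(axis=-1)
    return levels[idx]

if __name__ == "__main__":
    levels = [-1.0, 0.0, 1.0]                 # e.g. ternary quantization (assumed)
    w = np.linspace(-2.0, 2.0, 9)
    for t in (1.0, 10.0, 100.0):              # larger t -> closer to the hard quantizer
        print(t, np.round(soft_quantize(w, levels, t), 3))
    print("hard", hard_quantize(w, levels))
```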
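The added update vector component is described only qualitatively in the abstract (strengthen the consistent direction, damp the oscillating one), which matches a momentum-style velocity term for stochastic gradient descent. The sketch below shows such a generic momentum update applied to a toy ill-conditioned quadratic; the coefficient gamma, the learning rate, and the toy problem itself are assumptions chosen for illustration, not the thesis's exact scheme.

```python
import numpy as np

def sgd_momentum_step(params, grads, velocity, lr=0.01, gamma=0.9):
    """One SGD update with a momentum (velocity) term.

    The velocity accumulates past gradients, reinforcing update components
    that point in a consistent direction and damping components that
    oscillate, which speeds up and stabilizes convergence of the loss.
    """
    for key in params:
        velocity[key] = gamma * velocity[key] + lr * grads[key]
        params[key] -= velocity[key]
    return params, velocity

if __name__ == "__main__":
    # Toy ill-conditioned quadratic 0.5 * w^T A w: plain gradient descent
    # oscillates along the steep axis, the momentum term damps that oscillation.
    A = np.diag([1.0, 50.0])
    params = {"w": np.array([5.0, 5.0])}
    velocity = {"w": np.zeros(2)}
    for step in range(200):
        grads = {"w": A @ params["w"]}        # gradient of 0.5 * w^T A w
        params, velocity = sgd_momentum_step(params, grads, velocity,
                                             lr=0.02, gamma=0.9)
    print("final w:", np.round(params["w"], 4))   # converges toward the minimum at 0
```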
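For a quantized network whose weights and activations are constrained to {-1, +1}, a dot product can be computed with bitwise XOR and a popcount instead of multiply-accumulate operations: encoding +1 as bit 1 and -1 as bit 0, agreeing positions contribute +1 and disagreeing ones -1, so dot = n - 2 * popcount(a XOR b). The minimal sketch below illustrates this replacement on packed Python integers; the packing scheme and function names are assumptions, and a real implementation would apply the same identity to machine words inside the convolution's inner loop, which is where the forward-propagation speed-up comes from.

```python
import numpy as np

def pack_signs(x):
    """Encode a vector of +/-1 values as an integer bitmask (+1 -> bit 1, -1 -> bit 0)."""
    bits = 0
    for i, v in enumerate(x):
        if v > 0:
            bits |= 1 << i
    return bits

def binary_dot(bits_a, bits_b, n):
    """Dot product of two +/-1 vectors of length n from their bitmasks.

    XOR marks the positions where the vectors disagree, so
    dot = n - 2 * popcount(a XOR b); one XOR plus one popcount replaces
    n multiplications and additions.
    """
    return n - 2 * bin(bits_a ^ bits_b).count("1")

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    a = rng.choice([-1, 1], size=64)
    b = rng.choice([-1, 1], size=64)
    # The bitwise result matches the ordinary floating/integer dot product.
    assert binary_dot(pack_signs(a), pack_signs(b), a.size) == int(a @ b)
    print("bitwise dot:", binary_dot(pack_signs(a), pack_signs(b), a.size))
```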
Keywords/Search Tags: quantized compression, accelerated training, accelerated forward propagation