Bit-Quantization Based Method For Speeding Up Deep Neural Networks

Posted on: 2018-10-25
Degree: Master
Type: Thesis
Country: China
Candidate: S Mu
Full Text: PDF
GTID: 2348330536977912
Subject: Software engineering

Abstract/Summary:
Since 2012, when AlexNet, the first deep neural network in the true sense, won the ImageNet competition, more and more researchers have begun to study and focus on the deep learning field. The essence of a deep convolutional neural network is to learn feature representations of the external input signal through repeated convolution and pooling operations; the learned feature vectors then serve as input to a final nonlinear classifier that makes predictions. The advantage of a deep neural network is that it can learn good feature representations directly from data, instead of relying on hand-crafted feature extraction. At present, deep neural networks have made remarkable achievements in computer vision, natural language processing, text analysis, audio processing, information retrieval, and other fields.

However, deep neural networks have defects and limitations in practical applications, namely the huge computational requirements and memory consumption caused by their great number of parameters. These problems can be absorbed when computing resources and hardware equipment are sufficient, but on hardware-limited devices such as mobile devices, the feasibility of running a plain deep neural network drops significantly. In this paper, we propose BQ-Net, a bit-quantization based method that accelerates the computation of neural networks and compresses the model by exploiting the discreteness of the quantized parameters. Our innovations and contributions are as follows:

1) We derive and analyze the mathematics of forward and backward propagation in deep neural networks, and show that replacing multiplications with bit operations can accelerate deep neural networks efficiently (a minimal sketch of this idea follows the list below).

2) We propose an innovative bit quantization of deep neural networks, which breaks with the traditional formulation of the convolution operation.

3) By analyzing the characteristics of the bit-quantized parameters, we exploit their discreteness to compress the model. The storage of the BQ-Net model is reduced to about 10% of the plain model, and it saves 50%~75% of the network's memory at run time while achieving state-of-the-art results (see the storage sketch after the list).

4) Finally, the experimental results show that BQ-Net achieves the goal of accelerating and compressing deep neural networks while guaranteeing performance, obtaining accuracy similar to that of full-precision networks.

In general, this paper analyzes and optimizes the propagation process of traditional deep neural networks and demonstrates the effectiveness of the proposed BQ-Net through both theoretical derivation and experiments.
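To make the bit-operation idea in contribution 1) concrete, here is a minimal Python sketch, not the thesis' actual implementation: assuming weights and activations are binarized to {-1, +1} (the exact BQ-Net quantization scheme is not given in this abstract), a dot product over packed sign bits reduces to an XOR followed by a popcount, with no multiplications.

```python
import numpy as np

def binarize(v):
    """Quantize a real-valued vector to {-1, +1} by sign."""
    return np.where(v >= 0, 1, -1).astype(np.int8)

def pack_signs(b):
    """Pack a ±1 vector into bytes (bit 1 encodes +1, bit 0 encodes -1)."""
    return np.packbits(b > 0)

def xnor_dot(xb, wb, n):
    """Multiplication-free dot product of two packed ±1 vectors of length n:
    dot = n - 2 * popcount(x XOR w), since XOR flags sign mismatches."""
    mismatches = int(np.unpackbits(np.bitwise_xor(xb, wb))[:n].sum())
    return n - 2 * mismatches

rng = np.random.default_rng(0)
x, w = rng.standard_normal(64), rng.standard_normal(64)
xb, wb = binarize(x), binarize(w)
assert xnor_dot(pack_signs(xb), pack_signs(wb), 64) == int(xb.astype(int) @ wb)
```

On real hardware the XOR and popcount operate on whole machine words, so dozens of multiply-accumulate operations collapse into a few instructions, which is the source of the claimed acceleration.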
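The compression figures in contribution 3) can be sanity-checked with a similar sketch, assuming, hypothetically, that each weight is stored as a single sign bit instead of a 32-bit float; per-layer scaling factors and any layers left at full precision would account for the gap between the resulting 1/32 ratio and the reported ~10% overall model size.

```python
import numpy as np

# Hypothetical 1-bit storage layout: one sign bit per weight, 8 weights per byte.
w = np.random.default_rng(1).standard_normal(4096).astype(np.float32)
packed = np.packbits(w >= 0)

print(f"full precision: {w.nbytes} bytes")   # 16384 bytes (4096 weights x 4 bytes)
print(f"packed signs:   {packed.nbytes} bytes")  # 512 bytes, 1/32 of the original
```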
Keywords/Search Tags: Deep Learning, Deep Neural Networks, Model Compression, Model Acceleration