
Research on Quantization Methods of Weight and Gate Parameters in the LSTM Neural Network Model

Posted on: 2020-08-11  Degree: Master  Type: Thesis
Country: China  Candidate: K P Li  Full Text: PDF
GTID: 2428330596993853  Subject: Electronic Science and Technology
Abstract/Summary:
With the rapid development of Artificial Intelligence, the application scenarios of DNNs (Deep Neural Networks) are becoming more and more complex, and the scale of the corresponding networks is expanding rapidly. How to implement a more complex DNN model with limited computing resources, or how to achieve greater data-processing throughput under the same computing overhead, has become an increasingly important topic in DNN research. However, research on quantization methods for RNN (Recurrent Neural Network) models, which play an important role in NLP (Natural Language Processing), remains scarce. Among the many RNN variants, the LSTM (Long Short-Term Memory) model is the most widely used. This thesis therefore applies the first-order residual quantization method and the Gumbel Softmax method to the Bi-LSTM-CRF (Bidirectional LSTM with a Conditional Random Field classifier) model, an important representative of LSTM models. In addition, a new method is proposed in which the weight parameters learn to compensate for the quantization error of the gate parameters.

In Chapter 2, five kinds of RNN models are compared and analyzed, and the Bi-LSTM-CRF model, which performs best among them, is selected as the benchmark model for the experiments in the subsequent chapters. In Chapter 3, the first-order residual quantization method used in convolutional neural networks is extended to quantize LSTM parameters, in particular the weight parameters. Experiments show that this method effectively reduces the performance loss caused by quantizing LSTM parameters.

In Chapter 4, the experiments show that the Gumbel Softmax method has limitations in this application. On this basis, the new quantization method proposed in this thesis is applied. The method performs lossy quantization on the gate parameters during training iterations, while the weight parameters learn to offset the quantization loss on the gate parameters through adjusted back-propagation gradients during weight optimization. On the Named Entity Recognition dataset, the F1 score of the model using the new gate-parameter quantization method decreases by only 0.7% compared with the baseline model. Furthermore, on the same dataset, the F1 score of the model combining the weight-parameter quantization method with the new gate-parameter quantization method decreases by only 0.3% compared with the baseline.

At the algorithm level, the experimental results demonstrate the effectiveness of the quantization methods above, all of which are selected with hardware-oriented design in mind. In the weight-parameter quantization, each element of the weight matrix can be quantized to ±1 and therefore represented by a single bit in hardware; correspondingly, the memory consumption of the LSTM weight parameters is reduced by 96.87% after quantization. In the gate-parameter quantization, the gate parameters can be quantized to three values: 0, 0.5, and 1. After this quantization, a large number of multiplications in the LSTM model can be transformed into additions and shift operations. The results of this research can therefore be extended to high-performance hardware designs of neural network models in subsequent work.
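The abstract does not give implementation details, but the core operations it describes can be illustrated with a short sketch. The following Python/NumPy snippet is a minimal, hypothetical illustration only: it assumes that first-order residual quantization approximates a weight matrix by a scaled sign matrix plus a scaled sign of the remaining residual, and that gate activations (which lie in [0, 1] after the sigmoid) are rounded to the nearest value in {0, 0.5, 1}. The thesis' gradient-adjustment scheme, in which the weights learn to offset the gate quantization error, is not reproduced here, and all function and variable names are illustrative rather than taken from the thesis.

```python
import numpy as np

def quantize_weights_first_order_residual(W):
    """Hypothetical sketch of first-order residual quantization:
    approximate W by a scaled +/-1 matrix plus a scaled +/-1 matrix of the
    residual, so each element reduces to sign bits plus a few scale factors."""
    alpha1 = np.mean(np.abs(W))            # scale for the first binary term
    B1 = np.where(W >= 0, 1.0, -1.0)       # element-wise +/-1
    R = W - alpha1 * B1                    # first-order residual
    alpha2 = np.mean(np.abs(R))            # scale for the residual term
    B2 = np.where(R >= 0, 1.0, -1.0)
    W_hat = alpha1 * B1 + alpha2 * B2      # quantized approximation of W
    return W_hat, (alpha1, B1, alpha2, B2)

def quantize_gate(g):
    """Hypothetical sketch of ternary gate quantization: round gate values
    in [0, 1] to the nearest of {0, 0.5, 1}, so multiplying by a gate becomes
    a zeroing (x * 0), a one-bit shift (x * 0.5), or a pass-through (x * 1)."""
    return np.round(2.0 * g) / 2.0

# Example: quantize a random weight matrix and a vector of gate activations.
rng = np.random.default_rng(0)
W = rng.normal(size=(4, 4)).astype(np.float32)
W_hat, _ = quantize_weights_first_order_residual(W)
g = 1.0 / (1.0 + np.exp(-rng.normal(size=4)))   # sigmoid-like gate values
print(quantize_gate(g))                          # values drawn from {0, 0.5, 1}
```

Storing a ±1 sign at 1 bit per element instead of a 32-bit float reduces weight memory to 1/32 of the original, i.e. by 96.875%, which matches the 96.87% figure quoted above; whether the residual sign matrix in this sketch is also stored depends on the exact formulation, which the abstract does not specify. Likewise, multiplying by a gate value of 0.5 is a single right shift, which is how the quantization turns many LSTM multiplications into shift and addition operations.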
Keywords/Search Tags:Artificial Intelligence, Natural Language Processing, Long Short-Term Memory, Recurrent Neural Networks, Quantization Method