
Acceleration, Compression, and Evaluation Methods for Deep Neural Networks

Posted on: 2020-10-14    Degree: Master    Type: Thesis
Country: China    Candidate: Z Chen    Full Text: PDF
GTID: 2428330572487269    Subject: Information and Communication Engineering
Abstract/Summary:
In recent years, deep neural networks have shown excellent performance on many artificial intelligence problems. At the same time, research in the fields of the Internet of Things and edge computing has gradually moved toward application. Combining deep learning with the Internet of Things and edge computing will greatly enhance productivity. However, deep learning techniques based on deep neural networks place high demands on hardware resources and are difficult to deploy on today's resource-constrained devices. To popularize deep learning applications, accelerating and compressing deep neural networks to reduce their consumption of resources and time has become a very important research topic. Therefore, based on low-rank approximation and channel pruning methods, this thesis studies acceleration, compression, and evaluation methods for deep neural networks, and includes the following work.

First, this thesis proposes a mathematical model of the channel pruning process. This model indicates that filter size is a bottleneck limiting the performance of channel pruning, and that reducing the filter size can improve it. Accordingly, a "decouple-stretch" channel pruning method is proposed: the filters are mapped into a network with smaller filters and a deeper structure, and channel pruning is then performed. Ablation experiments and hardware simulation experiments verify that the "decouple-stretch" method improves the performance of channel pruning.

Then, this thesis proposes a channel pruning method combined with low-rank approximation, which eliminates the redundancy in the filters more thoroughly from a mathematical point of view. The network is first modified with low-rank approximation to obtain the greatest acceleration gain; channel pruning is then applied to obtain the greatest compression gain. In addition, a new rank selection indicator is proposed to better balance the low-rank approximation step and the channel pruning step. Ablation experiments and hardware simulation experiments verify that channel pruning combined with low-rank approximation outperforms either low-rank approximation or channel pruning alone.

Finally, this thesis proposes an evaluator based on the integral of fitted curves, which gives a more comprehensive and reasonable assessment of the efficiency of different acceleration and compression algorithms. Experiments verify the rationality and necessity of this indicator. In addition, using this indicator to compare the two acceleration and compression methods proposed in this thesis against baseline algorithms confirms the performance improvement of the proposed methods.
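To make the "decouple-stretch" idea concrete, here is a minimal PyTorch sketch of one possible filter decomposition: a 5x5 convolution is replaced by two stacked 3x3 convolutions, so each filter becomes smaller while the network grows deeper. The function name, the 5x5-to-two-3x3 mapping, and the choice of intermediate width are illustrative assumptions for this sketch, not the thesis's actual construction.

```python
import torch.nn as nn

def decouple_stretch(conv: nn.Conv2d) -> nn.Sequential:
    """Hypothetical sketch: replace one 5x5 convolution (stride 1,
    padding 2) with two stacked 3x3 convolutions. Two chained 3x3
    kernels cover the same 5x5 receptive field, so the layer is
    "stretched" deeper while each filter shrinks -- the condition
    under which channel pruning is claimed to work better.
    """
    assert conv.kernel_size == (5, 5) and conv.stride == (1, 1)
    mid = conv.out_channels  # intermediate width; an arbitrary choice here
    return nn.Sequential(
        nn.Conv2d(conv.in_channels, mid, kernel_size=3, padding=1, bias=False),
        nn.Conv2d(mid, conv.out_channels, kernel_size=3, padding=1,
                  bias=conv.bias is not None),
    )
```

Any standard channel pruning criterion would then operate on the two new 3x3 layers instead of the original 5x5 layer.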
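The combined low-rank-then-prune pipeline can be sketched in the same spirit: a convolution's weight is SVD-factorised into a thin k x k stage plus a 1x1 stage (the low-rank approximation step), after which the thin stage is channel-pruned. The `rank` and `keep` parameters stand in for the thesis's rank selection indicator, and the L1-norm pruning criterion is a common stand-in, not necessarily the one used in the thesis.

```python
import torch
import torch.nn as nn

def lowrank_then_prune(conv: nn.Conv2d, rank: int, keep: int) -> nn.Sequential:
    """Hypothetical sketch: SVD-factorise a convolution into a thin
    k x k stage (`rank` filters) plus a 1x1 stage, then channel-prune
    the thin stage down to the `keep` filters with the largest L1 norms.
    """
    C_out, C_in, k, _ = conv.weight.shape
    W = conv.weight.detach().reshape(C_out, -1)          # C_out x (C_in*k*k)
    U, S, Vh = torch.linalg.svd(W, full_matrices=False)
    # Low-rank step: W ~= U[:, :rank] @ diag(S[:rank]) @ Vh[:rank]
    first = nn.Conv2d(C_in, rank, k, padding=conv.padding, bias=False)
    first.weight.data = (Vh[:rank] * S[:rank, None]).reshape(rank, C_in, k, k)
    second = nn.Conv2d(rank, C_out, 1, bias=conv.bias is not None)
    second.weight.data = U[:, :rank].reshape(C_out, rank, 1, 1)
    if conv.bias is not None:
        second.bias.data = conv.bias.detach().clone()
    # Channel pruning step: keep the strongest filters of the thin stage
    # and drop the matching input channels of the 1x1 stage.
    idx = first.weight.data.abs().sum(dim=(1, 2, 3)).topk(keep).indices.sort().values
    pruned1 = nn.Conv2d(C_in, keep, k, padding=conv.padding, bias=False)
    pruned1.weight.data = first.weight.data[idx]
    pruned2 = nn.Conv2d(keep, C_out, 1, bias=conv.bias is not None)
    pruned2.weight.data = second.weight.data[:, idx]
    if conv.bias is not None:
        pruned2.bias.data = second.bias.data.clone()
    return nn.Sequential(pruned1, pruned2)
```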
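The integral-of-fitted-curve evaluator can likewise be illustrated: fit a curve to a method's measured (speedup, accuracy) operating points and integrate it over a fixed speedup range, so methods are compared across the whole trade-off rather than at a single point. The quadratic fit, the [1x, 4x] bounds, and the normalisation below are assumptions made for the sketch, not the thesis's exact protocol.

```python
import numpy as np

def curve_integral_score(speedups, accuracies, lo=1.0, hi=4.0):
    """Hypothetical sketch of the integral evaluator: fit a curve to the
    measured (speedup, accuracy) points and return the normalised area
    under it over the speedup range [lo, hi], i.e. the mean accuracy
    the method sustains across that range.
    """
    coeffs = np.polyfit(speedups, accuracies, deg=2)  # quadratic fit (a choice)
    antideriv = np.polyint(coeffs)                    # coefficients of the integral
    area = np.polyval(antideriv, hi) - np.polyval(antideriv, lo)
    return area / (hi - lo)

# Example: a method that degrades gracefully across the whole range scores
# higher than one that is only good at a single operating point.
# curve_integral_score([1, 2, 3, 4], [0.93, 0.91, 0.88, 0.84])
```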
Keywords/Search Tags: Deep Learning, Deep Neural Network Acceleration, Channel Pruning, Hardware Resources, Evaluation Method