
Acceleration, Compression, and Evaluation Methods for Deep Neural Networks

Posted on: 2020-10-14    Degree: Master    Type: Thesis
Country: China    Candidate: Z Chen    Full Text: PDF
GTID: 2428330572487269    Subject: Information and Communication Engineering
Abstract/Summary:
In recent years, deep neural networks have shown excellent performance on many artificial intelligence problems. At the same time, research in the fields of the Internet of Things and edge computing has gradually moved toward application. Combining deep learning with the Internet of Things and edge computing will greatly enhance productivity. However, deep learning techniques based on deep neural networks place high demands on hardware resources and are difficult to deploy on today's resource-constrained devices. To popularize deep learning applications, accelerating and compressing deep neural networks to reduce their consumption of resources and time has become a very important research topic. Therefore, based on low-rank approximation and channel pruning methods, this thesis studies acceleration, compression, and evaluation methods for deep neural networks, and includes the following work.

First, this thesis proposes a mathematical model of the channel pruning process. This model indicates that filter size is a bottleneck limiting the performance of channel pruning, and that reducing the filter size can improve it. Accordingly, a "decouple-stretch" channel pruning method is proposed: the filters are mapped into a network with smaller filters and a deeper structure, and channel pruning is then performed. Ablation experiments and hardware simulation experiments verify that the "decouple-stretch" method improves the performance of channel pruning.

Then, this thesis proposes a channel pruning method combined with low-rank approximation, which eliminates the redundancy in the filters more thoroughly from a mathematical point of view. The network is first modified with low-rank approximation to obtain the greatest acceleration gain; channel pruning is then applied to obtain the greatest compression gain. In addition, a new rank selection indicator is proposed to better balance the low-rank approximation step and the channel pruning step. Ablation experiments and hardware simulation experiments verify that channel pruning combined with low-rank approximation outperforms either low-rank approximation or channel pruning alone.

Finally, this thesis proposes an evaluator based on the integral of fitted curves, which gives a more comprehensive and reasonable assessment of the efficiency of different acceleration and compression algorithms. Experiments verify the rationality and necessity of this indicator. In addition, using this indicator to compare the two acceleration and compression methods proposed in this thesis against baseline algorithms confirms the performance improvement of the proposed methods.
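To make the "decouple-stretch" idea concrete, here is a minimal PyTorch sketch of one possible filter decomposition: a 5x5 convolution is replaced by two stacked 3x3 convolutions, so each filter becomes smaller while the network grows deeper. The function name, the 5x5-to-two-3x3 mapping, and the choice of intermediate width are illustrative assumptions for this sketch, not the thesis's actual construction.

```python
import torch.nn as nn

def decouple_stretch(conv: nn.Conv2d) -> nn.Sequential:
    """Hypothetical sketch: replace one 5x5 convolution (stride 1,
    padding 2) with two stacked 3x3 convolutions. Two chained 3x3
    kernels cover the same 5x5 receptive field, so the layer is
    "stretched" deeper while each filter shrinks -- the condition
    under which channel pruning is claimed to work better.
    """
    assert conv.kernel_size == (5, 5) and conv.stride == (1, 1)
    mid = conv.out_channels  # intermediate width; an arbitrary choice here
    return nn.Sequential(
        nn.Conv2d(conv.in_channels, mid, kernel_size=3, padding=1, bias=False),
        nn.Conv2d(mid, conv.out_channels, kernel_size=3, padding=1,
                  bias=conv.bias is not None),
    )
```

Any standard channel pruning criterion would then operate on the two new 3x3 layers instead of the original 5x5 layer.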
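The combined low-rank-then-prune pipeline can be sketched in the same spirit: a convolution's weight is SVD-factorised into a thin k x k stage plus a 1x1 stage (the low-rank approximation step), after which the thin stage is channel-pruned. The `rank` and `keep` parameters stand in for the thesis's rank selection indicator, and the L1-norm pruning criterion is a common stand-in, not necessarily the one used in the thesis.

```python
import torch
import torch.nn as nn

def lowrank_then_prune(conv: nn.Conv2d, rank: int, keep: int) -> nn.Sequential:
    """Hypothetical sketch: SVD-factorise a convolution into a thin
    k x k stage (`rank` filters) plus a 1x1 stage, then channel-prune
    the thin stage down to the `keep` filters with the largest L1 norms.
    """
    C_out, C_in, k, _ = conv.weight.shape
    W = conv.weight.detach().reshape(C_out, -1)          # C_out x (C_in*k*k)
    U, S, Vh = torch.linalg.svd(W, full_matrices=False)
    # Low-rank step: W ~= U[:, :rank] @ diag(S[:rank]) @ Vh[:rank]
    first = nn.Conv2d(C_in, rank, k, padding=conv.padding, bias=False)
    first.weight.data = (Vh[:rank] * S[:rank, None]).reshape(rank, C_in, k, k)
    second = nn.Conv2d(rank, C_out, 1, bias=conv.bias is not None)
    second.weight.data = U[:, :rank].reshape(C_out, rank, 1, 1)
    if conv.bias is not None:
        second.bias.data = conv.bias.detach().clone()
    # Channel pruning step: keep the strongest filters of the thin stage
    # and drop the matching input channels of the 1x1 stage.
    idx = first.weight.data.abs().sum(dim=(1, 2, 3)).topk(keep).indices.sort().values
    pruned1 = nn.Conv2d(C_in, keep, k, padding=conv.padding, bias=False)
    pruned1.weight.data = first.weight.data[idx]
    pruned2 = nn.Conv2d(keep, C_out, 1, bias=conv.bias is not None)
    pruned2.weight.data = second.weight.data[:, idx]
    if conv.bias is not None:
        pruned2.bias.data = second.bias.data.clone()
    return nn.Sequential(pruned1, pruned2)
```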
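The integral-of-fitted-curve evaluator can likewise be illustrated: fit a curve to a method's measured (speedup, accuracy) operating points and integrate it over a fixed speedup range, so methods are compared across the whole trade-off rather than at a single point. The quadratic fit, the [1x, 4x] bounds, and the normalisation below are assumptions made for the sketch, not the thesis's exact protocol.

```python
import numpy as np

def curve_integral_score(speedups, accuracies, lo=1.0, hi=4.0):
    """Hypothetical sketch of the integral evaluator: fit a curve to the
    measured (speedup, accuracy) points and return the normalised area
    under it over the speedup range [lo, hi], i.e. the mean accuracy
    the method sustains across that range.
    """
    coeffs = np.polyfit(speedups, accuracies, deg=2)  # quadratic fit (a choice)
    antideriv = np.polyint(coeffs)                    # coefficients of the integral
    area = np.polyval(antideriv, hi) - np.polyval(antideriv, lo)
    return area / (hi - lo)

# Example: a method that degrades gracefully across the whole range scores
# higher than one that is only good at a single operating point.
# curve_integral_score([1, 2, 3, 4], [0.93, 0.91, 0.88, 0.84])
```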
Keywords/Search Tags: Deep Learning, Deep Neural Network Acceleration, Channel Pruning, Hardware Resources, Evaluation Method