
Research On Lossy Compression Of Floating-points For Neural Networks

Posted on: 2022-02-10
Degree: Master
Type: Thesis
Country: China
Candidate: Z B Hu
Full Text: PDF
GTID: 2518306569997509
Subject: Computer technology
Abstract/Summary:
With the advent of the big-data era, artificial intelligence has made great progress, and deep neural networks, owing to their strong performance, have attracted extensive attention and research across a large number of practical tasks. However, as neural network technology develops, model structures are becoming increasingly complex and parameter counts keep growing, so the time and resource costs of running neural networks in practice are excessive. This poses challenges for both the training and the practical application of neural networks, especially on resource-constrained platforms such as mobile and embedded devices. This thesis studies the application of lossy floating-point compression in neural network scenarios. Targeting the high resource overhead of neural network models during training and inference, the following two studies were carried out:

First, in distributed training and similar settings, a neural network model is frequently transmitted over the network or stored locally, consuming excessive network and storage resources. To address this, a lossy delta-compression framework for neural networks, Delta-DNN, is proposed. By observing the floating-point parameters of adjacent versions of a large number of trained neural network models, it is found that adjacent versions are highly similar. Inspired by delta-compression techniques from the field of data compression, Delta-DNN saves only the difference data between two adjacent versions (i.e., the parameter deltas) rather than the complete model, and compresses this differential data with error-controlled lossy compression, achieving a high compression ratio while strictly keeping the error within a given range. The Delta-DNN framework was evaluated in two scenarios: reducing the network overhead and reducing the storage overhead of neural networks.

Second, when neural network models run inference on resource-constrained platforms, their runtime memory usage is often so high that most models cannot run at all. Based on lossy floating-point compression, an inference framework for resource-constrained scenarios, Smart-DNN, is proposed. By analyzing how a neural network model is computed, the complete model is partitioned into blocks according to its topology, and each block is compressed separately using error-controlled lossy floating-point compression. This yields a better compression ratio while keeping the structure of the neural network itself unchanged, and reduces memory consumption at runtime by decompressing only the blocks currently needed.

According to the experimental results, Delta-DNN can effectively improve the compression ratio by 2×-10× without sacrificing the inference accuracy of the neural network and without changing the network structure, and the Smart-DNN framework can reduce the memory overhead to 1/5-1/10 while lowering inference accuracy by no more than 0.2%.
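The delta-plus-lossy-compression idea behind Delta-DNN can be sketched as follows. This is a minimal illustration, not the thesis's implementation: plain uniform quantization stands in for the error-bounded lossy compressor, the entropy-coding stage is omitted, and the function names and error bound are illustrative.

```python
import numpy as np

def compress_delta(prev, curr, error_bound=1e-3):
    """Error-bounded lossy compression of the parameter delta between
    two adjacent model versions (sketch of the Delta-DNN idea)."""
    delta = curr - prev
    # Uniform quantization: map each delta to an integer bin of width
    # 2*error_bound, so the reconstruction error is <= error_bound.
    return np.round(delta / (2 * error_bound)).astype(np.int32)

def decompress_delta(prev, codes, error_bound=1e-3):
    """Reconstruct the newer version from the older one plus the codes."""
    return prev + codes * (2 * error_bound)

# Adjacent versions are highly similar, so most deltas fall into the
# zero bin; a run of zero codes compresses well under entropy coding.
prev = np.random.randn(1000).astype(np.float32)
curr = prev + np.random.randn(1000).astype(np.float32) * 1e-4
codes = compress_delta(prev, curr)
recon = decompress_delta(prev, codes)
assert np.max(np.abs(recon - curr)) <= 1e-3  # error strictly bounded
```

Note that only the quantized codes (and one base version) need to be stored or transmitted; every later version is rebuilt by applying its delta to the previous reconstruction.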
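The block-wise compression and partial decompression in Smart-DNN can likewise be sketched. Again this is a simplified assumption-laden illustration: the `PartialModel` class, the per-layer uniform quantizer, and the ReLU-only forward pass are placeholders, not the framework's actual block partitioning or compressor.

```python
import numpy as np

def quantize_block(w, error_bound=1e-3):
    # Error-bounded uniform quantization of one block's weights
    # (stand-in for the real floating-point lossy compressor).
    return np.round(w / (2 * error_bound)).astype(np.int16)

def dequantize_block(codes, error_bound=1e-3):
    return codes.astype(np.float32) * (2 * error_bound)

class PartialModel:
    """Keeps every block in compressed form and decompresses only the
    block currently needed, so peak runtime memory stays low."""
    def __init__(self, layers, error_bound=1e-3):
        self.error_bound = error_bound
        # Partition by topology: here, one block per named layer.
        self.blocks = {name: quantize_block(w, error_bound)
                       for name, w in layers.items()}

    def forward(self, x):
        for name in sorted(self.blocks):
            w = dequantize_block(self.blocks[name], self.error_bound)
            x = np.maximum(x @ w, 0.0)  # placeholder ReLU layer
            del w  # full-precision weights are transient per block
        return x

layers = {"l1": np.random.randn(8, 16).astype(np.float32),
          "l2": np.random.randn(16, 4).astype(np.float32)}
model = PartialModel(layers)
out = model.forward(np.random.randn(4, 8).astype(np.float32))
```

The design point is that the model's topology is untouched; only the storage format of each block changes, and decompression happens block by block during the forward pass.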
Keywords/Search Tags: Neural Network, Lossy Compression, Floating Point, Delta Compression, Partial Decompression