Research On Lightweight And Reliability Of Convolutional Neural Network Edge Computing For Non-volatile Memory

Posted on:2024-04-12

Degree:Master

Type:Thesis

Country:China

Candidate:Z Q Wang


GTID:2558307157973849

Subject:Software engineering

Abstract/Summary:

To serve a wide range of computer vision scenarios, a growing number of convolutional neural network models are being deployed on edge devices. However, such systems are constrained by limited computational and storage resources. Non-volatile memory (NVM), an emerging class of memory materials, is a promising way to break this bottleneck thanks to its high storage density, good scalability, and low static power consumption. In this work, NVM is therefore employed as a hardware accelerator for neural network structures, referred to as EBNA, and the accuracy and computational efficiency of the models are analyzed after applying quantization, pruning, and defensive distillation at the network structure level. The main contributions of this work are as follows.

(1) Quantization techniques are explored to shrink the size of convolutional neural network models. On the one hand, the differences among zero-sample quantization, diverse sample generation, loss minimization without data, and data-free model compression methods are analyzed, and these methods are applied to EBNA. On the other hand, a binary complementary quantization technique is proposed based on data-aware quantization techniques. Furthermore, a straight-through estimator is introduced to avoid zero gradients in the quantization process for EBNA. In the study, the model's top-1 accuracy changed from 91.99% to 92.15%, 91.97%, and 91.67% with weight quantization bit widths of 8, 6, and 4, respectively.

(2) A pruning approach based on a dynamic fixed threshold percentage is proposed to reduce the volume of convolutional neural network models. The method alternates connection pruning and fine-tuning in a cycle to preserve model performance. The compression ratio is controlled by dynamically adjusting the pruning threshold during fine-tuning. Experimental results show that, with the proposed threshold selection method, the model volume was reduced from 60.44 MB to 6.81 MB and the model accuracy improved from 91.99% to 92.19%.

(3) Protection concerns are analyzed to enhance the reliability of convolutional neural network models. These concerns are addressed by improving the defensive distillation technique and using it as a protection mechanism. By reducing sensitivity to input perturbations, the improved defensive distillation technique produces smoother classifier models and lowers the success rate of adversarial sample generation. In the case study, compared with the original model, the distilled model increased the classification accuracy from 91.99% to 92.11% and reduced the success rate of generating adversarial samples from 88.12% to 7.23%.

(4) A framework for defensive compression models is proposed to meet the requirements of small size, high accuracy, and reliability for convolutional neural network models on EBNA. The framework combines the quantization and pruning compression approaches, uses clustering methods to group multi-layer model weights, introduces weight sharing, and integrates the improved defensive distillation technique. Experiments were conducted on the CIFAR10 dataset using a VGG16 model within the defensive compression model framework. The experimental results show that the framework compresses the model volume from 61.82 MB to 2.13 MB, increases the classification accuracy on the CIFAR10 dataset from 92.11% to 93.73%, and decreases the success rate of generating adversarial samples from 88.12% to 7.23%.

In conclusion, this study aims to meet the requirements of convolutional neural network models on EBNA by applying quantization, pruning, and defensive distillation techniques at the network structure level.
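The role of the straight-through estimator in contribution (1) can be illustrated with a minimal sketch. It assumes plain uniform symmetric quantization of a single weight, which is not the binary complementary scheme proposed in the thesis; the names `quantize`, `ste_grad`, and `train_step` and the squared-error loss are hypothetical and chosen only for illustration.

```python
def quantize(w, bits, w_max):
    """Uniform symmetric quantization of w onto signed `bits`-bit levels in [-w_max, w_max]."""
    levels = 2 ** (bits - 1) - 1              # e.g. 127 for 8 bits, 7 for 4 bits
    scale = w_max / levels                    # width of one quantization step
    clipped = max(-w_max, min(w_max, w))      # clip to the representable range
    return round(clipped / scale) * scale


def ste_grad(w, upstream_grad, w_max):
    """Straight-through estimator (assumed form): rounding has zero gradient
    almost everywhere, so the backward pass treats it as the identity inside
    the clipping range and blocks the gradient outside it."""
    return upstream_grad if -w_max <= w <= w_max else 0.0


def train_step(w, target, bits=4, w_max=1.0, lr=0.1):
    """One gradient step on the toy loss (quantize(w) - target)^2."""
    q = quantize(w, bits, w_max)
    grad_q = 2.0 * (q - target)               # gradient w.r.t. the quantized weight
    return w - lr * ste_grad(w, grad_q, w_max)


if __name__ == "__main__":
    w = 0.5
    for _ in range(5):
        w = train_step(w, target=0.0)
    print(w, quantize(w, 4, 1.0))
```

Without the estimator, the gradient of the rounding step would be zero and the full-precision weight would never move; with it, the latent weight keeps updating while the forward pass still sees quantized values.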
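The cyclic prune-and-fine-tune procedure in contribution (2) might look roughly like the following sketch. It assumes simple magnitude pruning with a percentage that rises from round to round; the schedule, the `fine_tune` callback, and the function names are hypothetical stand-ins, and the thesis's actual threshold selection rule is not reproduced here.

```python
def prune_mask(weights, percent):
    """Return a 0/1 mask that removes the `percent` % smallest-magnitude weights."""
    k = int(len(weights) * percent / 100.0)
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    mask = [1] * len(weights)
    for i in order[:k]:
        mask[i] = 0
    return mask


def iterative_prune(weights, schedule, fine_tune):
    """Alternate pruning and fine-tuning, one round per percentage in `schedule`.

    `schedule` is assumed to be increasing, so weights pruned in an earlier
    round (now zero) are always among those pruned again in later rounds.
    `fine_tune(weights, mask)` is a hypothetical callback that must keep
    masked weights at zero.
    """
    mask = [1] * len(weights)
    for percent in schedule:
        mask = prune_mask(weights, percent)
        weights = [w * m for w, m in zip(weights, mask)]
        weights = fine_tune(weights, mask)
    return weights, mask


def no_op_fine_tune(weights, mask):
    """Placeholder for real training: only re-applies the mask."""
    return [w * m for w, m in zip(weights, mask)]


if __name__ == "__main__":
    w = [0.05, -0.8, 0.3, -0.02, 0.6, 0.01, -0.4, 0.9, 0.1, -0.07]
    pruned, mask = iterative_prune(w, [20, 50, 80], no_op_fine_tune)
    print(pruned)
    print("kept", sum(mask), "of", len(mask), "weights")
```

Raising the percentage gradually, rather than pruning to the final ratio in one step, gives fine-tuning a chance to compensate for each round of removed connections.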
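The smoothing effect that contribution (3) relies on comes from the temperature-scaled softmax used in standard defensive distillation. The sketch below shows only that standard form, not the improved variant developed in the thesis; the logits, the temperature value, and the function names are illustrative assumptions.

```python
import math


def softmax_t(logits, temperature=1.0):
    """Softmax at temperature T; a larger T gives a flatter distribution."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)                               # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature):
    """Cross-entropy between the teacher's soft labels and the student's
    predictions, both computed at the same temperature."""
    soft_labels = softmax_t(teacher_logits, temperature)
    predictions = softmax_t(student_logits, temperature)
    return -sum(p * math.log(q) for p, q in zip(soft_labels, predictions))


if __name__ == "__main__":
    teacher = [8.0, 2.0, -1.0]                    # hypothetical teacher logits
    print(softmax_t(teacher, temperature=1.0))    # sharply peaked
    print(softmax_t(teacher, temperature=20.0))   # much smoother soft labels
    print(distillation_loss([0.0, 0.0, 0.0], teacher, 20.0))
```

Training the student on the smoother soft labels reduces how sharply its outputs react to small input changes, which is the property that makes adversarial samples harder to generate.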
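The clustering and weight-sharing step in contribution (4) can be sketched as one-dimensional k-means over a layer's weights, with every weight replaced by its cluster centroid. This is an assumed, generic form of weight sharing; the abstract does not specify the clustering method, and `kmeans_1d`, the linear initialization, and the sample weights are hypothetical.

```python
def nearest(value, centroids):
    """Index of the centroid closest to `value`."""
    return min(range(len(centroids)), key=lambda c: abs(value - centroids[c]))


def kmeans_1d(values, k, iters=20):
    """Plain 1-D k-means with linearly spaced initial centroids (requires k >= 2)."""
    lo, hi = min(values), max(values)
    centroids = [lo + (hi - lo) * i / (k - 1) for i in range(k)]
    for _ in range(iters):
        assign = [nearest(v, centroids) for v in values]
        for c in range(k):
            members = [v for v, a in zip(values, assign) if a == c]
            if members:                           # leave empty clusters where they are
                centroids[c] = sum(members) / len(members)
    return centroids, [nearest(v, centroids) for v in values]


def share_weights(values, k):
    """Replace each weight by its cluster centroid; only the k centroids and
    one small index per weight then need to be stored."""
    centroids, assign = kmeans_1d(values, k)
    return [centroids[a] for a in assign], centroids, assign


if __name__ == "__main__":
    layer = [-0.9, -0.85, -0.1, 0.0, 0.05, 0.8, 0.95]
    shared, centroids, assign = share_weights(layer, k=3)
    print(centroids)
    print(assign)
    print(shared)
```

With k shared values, each weight index needs only about log2(k) bits, which is where the additional storage saving on top of pruning and quantization would come from.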

Keywords/Search Tags:

Edge Computing, Non-Volatile Memory, Convolutional Neural Networks, Quantization, Pruning

Related items

1	Research On High-performance And Low-power Edge Computing Based On Non-volatile Memory
2	Research On Computing-in-Memory Circuit And System For Edge Neural Network Accelerator
3	Research On Convolutional Neural Network Compression Strategy For Edge Computing
4	Research On Time-Domain Based Intelligent Edge Processing Core With Non-Volatile Memory
5	Research On Edge Computing Deep Neural Network Optimization Technology Based On Channel Pruning
6	Memristive Neural Networks: Co-design Of Devices And Algorithms
7	Research On Key Processing-in-memory Technologies With High-performance And Low-power For Deep Learning On Edge Devices
8	Research On Key Technologies Of Memory Architecture For In-memory Computing
9	Non-volatile Memory Device Based Neural Network Accelerator Design
10	Research On The Application Of Memristor With Both Volatile And Non-volatile In Neuromorphic Computing