In recent years, deep convolutional neural networks have achieved great success in image classification, object detection, semantic segmentation, and other tasks. However, the advantages of CNNs come with deep model structures, which require extensive computing resources and memory, hindering their deployment in real production. It is therefore crucial to explore ways of reducing model size while barely sacrificing performance. In this paper, we work on a model acceleration technique called knowledge distillation and propose two methods to improve its performance; the proposed methods achieve state-of-the-art results. The key idea of knowledge distillation is to transfer knowledge from a deep teacher model to a shallower student model. Benefiting from the transferred knowledge, the performance of the student can be improved and brought close to that of the teacher. If the student performs exactly as well as the teacher, we can consider the teacher to have been compressed into a lightweight student model. In this paper, we claim that it is important to transfer feature knowledge at the down-sampling points of a network. Meanwhile, we propose to decompose the transfer process into two steps: backbone learning and task-head fine-tuning. A stage-by-stage knowledge distillation is then applied, which facilitates progressive feature learning from teacher to student. Considering that a gap still exists between the student and teacher networks, we introduce an assistant model to reduce this gap. Specifically, the student is trained to mimic the hidden feature maps of the teacher, and the assistant aids this process by learning the residual error between them. In this way, the student and assistant complement each other to obtain better knowledge from the teacher.
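The residual-assisted feature mimicking described above can be sketched as follows. This is a minimal illustration, not the paper's actual implementation: the array shapes, variable names, and the plain mean-squared-error objective are all assumptions made for clarity.

```python
import numpy as np

# Hypothetical hidden feature maps at one down-sampling point,
# with shape (batch, channels, height, width). These values stand in
# for the outputs of teacher, student, and assistant networks.
rng = np.random.default_rng(0)
teacher_feat = rng.standard_normal((2, 8, 4, 4))
student_feat = rng.standard_normal((2, 8, 4, 4))
assistant_feat = rng.standard_normal((2, 8, 4, 4))

def mse(a, b):
    """Mean-squared error between two feature maps."""
    return float(np.mean((a - b) ** 2))

# The student is trained to mimic the teacher's hidden feature map.
student_loss = mse(student_feat, teacher_feat)

# The assistant learns the residual error the student leaves behind,
# so that (student + assistant) together approximate the teacher.
residual_target = teacher_feat - student_feat
assistant_loss = mse(assistant_feat, residual_target)

# Combined objective for one distillation stage; stage-by-stage training
# would apply such a loss at each down-sampling point in turn.
total_loss = student_loss + assistant_loss
```

In an actual training loop these losses would be minimized by gradient descent over the student and assistant parameters; the sketch only shows how the residual target couples the two models.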