The operation of deep neural networks usually depends on high-performance graphics cards, large-capacity storage, efficient cooling systems, and other costly hardware, and this cost seriously restricts their further development and adoption. The better a network performs, the larger its parameter count and model complexity tend to be, to the point that ordinary computing equipment can hardly run it. The compression and acceleration of deep neural networks has therefore become a research hotspot, but the lightweight design of complex models usually causes a loss of accuracy. To address this accuracy drop, the paper takes the deep residual network ResNet as an example and systematically studies both the lightweighting of deep neural networks and the recovery of their accuracy. Building on knowledge distillation, the current mainstream model-lightweighting method, as its basic framework, the paper proposes a new approach that combines a focused attention mechanism (FA), adopts generative adversarial learning (GAL), and uses a multi-teacher hybrid teaching network based on ensemble learning. The improved network effectively raises accuracy on image classification tasks. The main contributions of the paper are as follows.

(1) The application of a focused attention mechanism to image classification is studied, and a network structure combining focused attention and knowledge distillation is proposed. Through an attention scoring mechanism, the distilled deep residual network focuses on the features relevant to the target task and filters out information irrelevant to the result, so that computing resources are used in a concentrated and efficient way. Shortcut connections are introduced to carry weights and attention scores across layers, which avoids the information redundancy caused by long-distance training and alleviates the problem that a deep network may lose important information from earlier layers as its depth grows (a sketch of this gating appears after the abstract).

(2) The paper studies the relationship between generative adversarial learning and the teaching mode of knowledge distillation. It proposes to recast teacher-student training as competitive learning and introduces a discriminator to distinguish the outputs of the teacher network from those of the student network. By continually adjusting parameters, the gap between the student and teacher networks is narrowed and the interactive learning between them is strengthened. During training, the teacher network continuously supervises the student network and is trained synchronously with it, which keeps the model parameters up to date, improves the student network's ability to discriminate on similar tasks, and improves training efficiency (see the adversarial training sketch below).

(3) A multi-teacher hybrid teaching training mode based on ensemble learning is proposed. The traditional single teacher network is replaced by a multi-teacher hybrid teaching network to broaden the teaching scope and strengthen the teaching capability of the teacher side. During training, the student model learns from multiple teacher networks at once, so that it acquires more effective information and its performance is enhanced (see the multi-teacher loss sketch below).
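The following is a minimal PyTorch sketch of the attention-gated residual block with cross-layer score shortcuts described in contribution (1). It is an illustration under assumptions, not the paper's exact architecture: the squeeze-and-excitation-style scoring head, the equal-weight fusion of previous scores, and all names here are hypothetical.

```python
import torch
import torch.nn as nn

class AttentionGatedBlock(nn.Module):
    """Residual block whose output is reweighted by channel attention scores.
    The scores are also returned so a later block can fuse them through a
    shortcut (hypothetical design; details are illustrative, not the paper's)."""
    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
        )
        # Attention scoring: global pool -> bottleneck MLP -> sigmoid in [0, 1].
        self.score = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1),
            nn.Sigmoid(),
        )

    def forward(self, x, prev_score=None):
        out = self.conv(x)
        score = self.score(out)                 # per-channel importance scores
        if prev_score is not None:
            # Shortcut: fuse scores carried over from an earlier block
            # (equal weighting is an assumption for illustration).
            score = 0.5 * (score + prev_score)
        return x + out * score, score           # residual add + gated features
```

In use, each block would receive the score returned by an earlier block, so important-feature information travels across layers without being recomputed from scratch.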
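Next, a hedged sketch of the adversarial teaching setup in contribution (2): a small discriminator learns to tell teacher logits from student logits, and the student is trained both on the task and to fool the discriminator. The discriminator shape, loss weighting, and function names are assumptions; the teacher is frozen here for simplicity, whereas the paper trains it synchronously with the student.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

num_classes = 10  # illustrative value
discriminator = nn.Sequential(
    nn.Linear(num_classes, 64), nn.LeakyReLU(0.2), nn.Linear(64, 1)
)

def adversarial_distill_step(student, teacher, d_opt, s_opt, x, y):
    with torch.no_grad():
        t_logits = teacher(x)
    s_logits = student(x)

    # 1) Discriminator step: teacher outputs labeled real, student outputs fake.
    d_real = discriminator(t_logits)
    d_fake = discriminator(s_logits.detach())
    d_loss = (F.binary_cross_entropy_with_logits(d_real, torch.ones_like(d_real))
              + F.binary_cross_entropy_with_logits(d_fake, torch.zeros_like(d_fake)))
    d_opt.zero_grad(); d_loss.backward(); d_opt.step()

    # 2) Student step: task loss plus an adversarial term that pushes the
    #    student's outputs toward being indistinguishable from the teacher's.
    d_fake = discriminator(s_logits)
    s_loss = (F.cross_entropy(s_logits, y)
              + F.binary_cross_entropy_with_logits(d_fake, torch.ones_like(d_fake)))
    s_opt.zero_grad(); s_loss.backward(); s_opt.step()
    return d_loss.item(), s_loss.item()
```

As the discriminator improves, the student must match the teacher's output distribution ever more closely, which is the "narrowing gap" the abstract describes.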
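Finally, a minimal sketch of the multi-teacher distillation loss behind contribution (3): the student matches the averaged soft targets of several teachers alongside the hard-label task loss. Equal teacher weights, the temperature T, and the mixing factor alpha are illustrative assumptions, not the paper's settings.

```python
import torch
import torch.nn.functional as F

def multi_teacher_kd_loss(s_logits, teacher_logits_list, y, T=4.0, alpha=0.7):
    # Average the teachers' temperature-softened distributions (equal weights assumed).
    soft_targets = torch.stack(
        [F.softmax(t / T, dim=1) for t in teacher_logits_list]
    ).mean(dim=0)
    # KL divergence between student and ensemble soft targets, scaled by T^2
    # as is conventional in distillation so gradients stay comparable.
    kd = F.kl_div(F.log_softmax(s_logits / T, dim=1), soft_targets,
                  reduction="batchmean") * (T * T)
    ce = F.cross_entropy(s_logits, y)   # hard-label task loss
    return alpha * kd + (1 - alpha) * ce
```

Averaging over several teachers gives the student a broader, smoother target distribution than any single teacher, which is the ensemble-learning effect the abstract appeals to.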