Research And Application Of Lightweight Neural Network Based On Knowledge Distillation

Posted on: 2022-07-23
Degree: Master
Type: Thesis
Country: China
Candidate: P Wang
Full Text: PDF
GTID: 2518306524493764
Subject: Master of Engineering
Abstract/Summary:
While the performance of deep models keeps improving, problems such as growing parameter counts, rising memory consumption, long training times, and heavy computation arrive with it. These problems prevent embedded devices, integrated devices, and other resource-constrained machines from running such models normally, which hinders the application and adoption of deep learning. Against this background and market demand, research on model compression methods is of great significance. Knowledge distillation is a model-lightweighting method that has attracted wide attention in recent years: a student model is trained to imitate a teacher model, and under the guidance of the teacher network the student network can better learn the structured knowledge in the dataset (the standard distillation objective is sketched after this abstract). On this basis, improved optimization methods built on the knowledge distillation strategy are proposed for two different tasks. The main contributions are as follows:

(1) A multi-block training method based on knowledge distillation is proposed. The thesis analyzes the limitations of current knowledge distillation methods on classification tasks and exploits the structure of the distillation model to divide the distillation process into two stages: a teacher-network feature learning stage and a self-learning stage. In the feature learning stage, each layer is trained without interference from the others and its loss function is computed independently; in the self-learning stage, the teacher network's weights are discarded and the student network trains directly on the dataset. A sketch of this two-stage scheme is given after the abstract.

(2) An anchor-based detection distillation method built on ground-truth (GT) masks, GT-KD, is proposed. Correlations between anchor points and ground-truth boxes are computed, and a GT mask is generated at the resolution of the network's feature maps, avoiding complex data preprocessing. The filter threshold of the GT mask can be adjusted to meet the needs of different training stages and accuracy targets. Through a feature adaptation function, the student network's features are transformed to the same shape as the teacher network's, which facilitates the subsequent distillation step; see the GT-mask sketch after the abstract.

(3) An environment perception system for driverless vehicles based on knowledge distillation is designed and implemented. Building on the above results, the GT-KD object detection distillation strategy is applied to the object detection function of the perception system, which effectively reduces the memory consumption of the industrial computer and improves running speed. Predictions are drawn onto the video and displayed through a visual interface. The system also implements hardware monitoring; system resource usage and function configuration can be viewed and adjusted through a web interface.

Experiments on public datasets show that the hierarchical feature-distillation training framework effectively improves classification performance. On the Imagenette dataset, the student network reaches 97.4% accuracy, and the gap between the student and teacher models narrows to 1.8%. The object detection model obtained with GT-KD performs better than the same small network trained alone, improving accuracy on the MS COCO dataset by 3.2%. After applying the GT-KD results to the driverless system, multiple network models run smoothly under limited hardware resources.
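As background for the methods above, the following is a minimal sketch of the standard knowledge-distillation objective (soft-label imitation plus hard-label supervision) that the thesis builds on. The temperature T, the weight alpha, and the function name are illustrative assumptions, not values taken from the thesis.

```python
import torch
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    """Combine hard-label cross-entropy with soft-label KL divergence."""
    # Soft targets: teacher probabilities at temperature T.
    soft_targets = F.softmax(teacher_logits / T, dim=1)
    log_student = F.log_softmax(student_logits / T, dim=1)
    # The KL term is scaled by T^2 to keep gradient magnitudes comparable.
    soft_loss = F.kl_div(log_student, soft_targets, reduction="batchmean") * T * T
    hard_loss = F.cross_entropy(student_logits, labels)
    return alpha * soft_loss + (1.0 - alpha) * hard_loss
```

In the thesis' terminology, the teacher network's soft outputs guide the student while the hard labels anchor it to the dataset.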
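The two-stage multi-block scheme in (1) could look roughly like the sketch below. The `extract_features` helper, the per-block MSE feature loss, and the optimizer handling are assumptions for illustration; the abstract does not give implementation details.

```python
import torch
import torch.nn.functional as F

def feature_stage_step(student, teacher, images, optimizer):
    """Stage 1: each student block independently imitates the matching teacher block."""
    with torch.no_grad():
        teacher_feats = teacher.extract_features(images)  # hypothetical helper
    student_feats = student.extract_features(images)      # hypothetical helper
    # Per-block losses are computed independently, so no block interferes with another.
    loss = sum(F.mse_loss(s, t) for s, t in zip(student_feats, teacher_feats))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

def self_stage_step(student, images, labels, optimizer):
    """Stage 2: the teacher's weights are discarded; the student trains on the data alone."""
    loss = F.cross_entropy(student(images), labels)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```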
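For the GT-KD method in (2), the sketch below shows one plausible form of the IoU-based GT mask, the feature adaptation layer, and the masked imitation loss. The mask construction, the threshold value, and the 1x1 adaptation convolution are assumptions drawn from common detection-distillation practice; the thesis' exact formulation may differ.

```python
import torch
import torch.nn as nn

def gt_mask(anchor_ious, thresh=0.5):
    """Mark feature locations whose best anchor overlaps a GT box above thresh.

    anchor_ious: (H, W, A) max IoU of each anchor with any ground-truth box.
    Returns a binary (H, W) mask; thresh is the tunable filter threshold.
    """
    return (anchor_ious.max(dim=-1).values > thresh).float()

class FeatureAdapter(nn.Module):
    """1x1 conv mapping student channels to the teacher's channel count."""
    def __init__(self, c_student, c_teacher):
        super().__init__()
        self.proj = nn.Conv2d(c_student, c_teacher, kernel_size=1)

    def forward(self, x):
        return self.proj(x)

def masked_imitation_loss(student_feat, teacher_feat, mask):
    """Squared error between features, restricted to masked locations.

    student_feat, teacher_feat: (N, C, H, W), already channel-aligned.
    mask: binary (H, W) mask from gt_mask().
    """
    m = mask.unsqueeze(0).unsqueeze(0)  # broadcast to (N, C, H, W)
    diff = (student_feat - teacher_feat) ** 2 * m
    # Normalize by the number of masked elements to keep the loss scale stable.
    denom = m.sum() * student_feat.size(0) * student_feat.size(1)
    return diff.sum() / denom.clamp(min=1.0)
```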
Keywords/Search Tags:Knowledge distillation, Object detection, Image classification, Deep learning