
Research On Model Reconstruction And Training Method Based On Increasing Capacity

Posted on: 2024-03-09
Degree: Master
Type: Thesis
Country: China
Candidate: Y Y Kang
Full Text: PDF
GTID: 2568307106967679
Subject: Information and Communication Engineering
Abstract/Summary:
With the rapid development of deep learning, convolutional neural networks have attracted widespread attention. Convolutional neural networks usually perform well when labeled training data is plentiful. However, preparing a large amount of annotated data for a new task, or for additional categories added to an existing task, requires human involvement, and this process incurs enormous manpower and financial costs that may be unaffordable for some research groups or individual researchers. Deep learning methods that work with a small number of samples therefore have important practical significance. A deep network can currently be trained on a small-sample dataset by using transfer learning to fine-tune on the target dataset, a strategy also known as pre-training and fine-tuning. Under this framework, the classification layer at the back end of the network is usually reconstructed for the specific task, and a network with good performance can be obtained by training on a small number of samples. There are many directions for improving pre-training and fine-tuning; this thesis mainly studies model reconstruction and training methods that increase network capacity to improve the performance of transfer learning. The specific work is as follows:

(1) A transfer learning method combining model capacity expansion with reinitialization is proposed. The method primarily addresses the poor performance of fine-tuning in transfer learning. Specifically, the capacity expansion strategy targets the ResNet architecture and reconstructs the ResBlock by widening or deepening its convolutional layers to increase model capacity. Reinitialization addresses the rapid convergence of the expanded network layers: by periodically reinitializing the newly added layers, the training loss is kept from saturating, which enables deeper updates of the network parameters and further improves fine-tuning performance (a minimal code sketch follows the abstract). Experiments on the CUB_200_2011 and Food-101-50 datasets validate the effectiveness of the proposed method: the widening variant with reinitialization of the widened convolutional layers improves accuracy by up to 2.3% over traditional fine-tuning.

(2) A semi-supervised training method for teacher-student networks based on structural reparameterization is proposed. To address the excessive parameter count and memory consumption of the traditional teacher-student semi-supervised framework, this work exploits the properties of reparameterization and proposes a nested teacher-student framework, in which the backbone network and the reconstructed network serve as the teacher network and the student network, respectively. The reconstructed network adds a branch structure to the backbone during training, and after training the newly added branch parameters are fused back into the backbone through reparameterization (the fusion step is sketched below). Compared with the traditional independent teacher-student model, the proposed nested framework only requires loading a single student network, significantly reducing the model parameters and memory requirements. Experimental results demonstrate that the nested framework can achieve performance similar to that of the independent teacher-student framework.
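To make contribution (1) concrete, here is a minimal PyTorch sketch of deepening one ResBlock of a pretrained ResNet and periodically reinitializing the added layers during fine-tuning. The class name ExpandedBlock, the choice of which block to expand, and the reinitialization period are illustrative assumptions, not the thesis's exact configuration.

```python
import torch.nn as nn
from torchvision.models import resnet18, ResNet18_Weights

class ExpandedBlock(nn.Module):
    """Deepens a pretrained ResBlock by appending an extra 3x3 conv stage."""
    def __init__(self, block, channels):
        super().__init__()
        self.block = block                  # original pretrained ResBlock
        self.extra = nn.Sequential(         # newly added capacity
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        return self.extra(self.block(x))

    def reinitialize(self):
        # Called periodically so the added layers do not converge too early,
        # keeping the loss informative and driving deeper parameter updates.
        for m in self.extra.modules():
            if isinstance(m, nn.Conv2d):
                nn.init.kaiming_normal_(m.weight, mode="fan_out", nonlinearity="relu")
            elif isinstance(m, nn.BatchNorm2d):
                nn.init.ones_(m.weight)
                nn.init.zeros_(m.bias)

model = resnet18(weights=ResNet18_Weights.DEFAULT)
model.layer4[1] = ExpandedBlock(model.layer4[1], channels=512)  # expand last block
model.fc = nn.Linear(512, 200)  # rebuilt classifier, e.g. 200 classes for CUB_200_2011

EPOCHS, REINIT_PERIOD = 30, 10  # illustrative values
for epoch in range(EPOCHS):
    # ... one standard fine-tuning pass over the small labeled dataset ...
    if (epoch + 1) % REINIT_PERIOD == 0 and epoch + 1 < EPOCHS:
        model.layer4[1].reinitialize()
```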
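For contribution (2), the core mechanism is the reparameterization fusion step: a branch added during training is folded into the backbone convolution afterwards, so only a single network needs to be loaded at deployment. The sketch below shows a RepVGG-style fusion of a 1x1 branch into a 3x3 convolution; the block name and the omission of the semi-supervised consistency training are assumptions made for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class RepBranchBlock(nn.Module):
    """Training time: backbone 3x3 conv plus an added 1x1 branch.
    Deployment time: the branch is folded into a single 3x3 conv."""
    def __init__(self, channels):
        super().__init__()
        self.conv3 = nn.Conv2d(channels, channels, 3, padding=1)  # backbone path
        self.conv1 = nn.Conv2d(channels, channels, 1)             # added branch

    def forward(self, x):
        return self.conv3(x) + self.conv1(x)

    def fuse(self):
        """Fold the 1x1 branch into the 3x3 kernel after training."""
        with torch.no_grad():
            # Zero-pad the 1x1 kernel to 3x3 so the two kernels can be summed.
            self.conv3.weight += F.pad(self.conv1.weight, [1, 1, 1, 1])
            self.conv3.bias += self.conv1.bias
        return self.conv3  # single fused convolution

# Equivalence check: the fused conv reproduces the two-branch forward pass.
block = RepBranchBlock(8)
x = torch.randn(1, 8, 16, 16)
y_train = block(x)
y_deploy = block.fuse()(x)
assert torch.allclose(y_train, y_deploy, atol=1e-5)
```

Because the fused convolution is mathematically identical to the two-branch structure, the extra branch adds capacity only during training and carries no parameter or memory overhead at inference, which is what lets the nested framework replace two independently stored teacher and student networks.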
Keywords/Search Tags: ResNet, Increased model capacity, Semi-supervised learning, Teacher-student model