Research On The Lottery Ticket Hypothesis For Image Classification Tasks

Posted on: 2023-11-06
Degree: Master
Type: Thesis
Country: China
Candidate: Z Y Zhang
GTID: 2568306902457874
Subject: Information and Communication Engineering
Abstract/Summary:
The lottery ticket hypothesis (LTH) reveals that randomly initialized dense networks contain winning tickets: sparse but critical subnetworks that can be trained in isolation from the same random initialization to match the dense network's performance. Previous studies narrowly equate network "performance" with test set accuracy. However, accuracy is far from the only evaluation metric, and perhaps not always the most important one. Hence it might be myopic to conclude that winning tickets can replace their dense counterparts, even if accuracy is preserved. In addition, finding winning tickets requires burdensome computation in the train-prune-retrain process, which restricts their practical benefits. Spurred by these observations, we perform a comprehensive assessment of winning tickets from diverse aspects beyond test accuracy, and we also propose a new algorithm to identify lottery tickets efficiently. The main work and novelty are summarized as follows.

Firstly, we conduct an exhaustive investigation of the characteristics of lottery tickets from four perspectives: (i) generalization to distribution shifts, (ii) prediction uncertainty, (iii) interpretability, and (iv) geometry of loss landscapes. Specifically, we consider both adversarial and natural perturbations to assess the networks' generalization. For prediction uncertainty, we use two typical metrics, static calibration error and negative log-likelihood. We also explore the interpretability of networks from both macro and micro perspectives, and we analyze the flatness of the loss landscape based on the eigenvalues and trace of its Hessian matrix. With extensive experiments across various datasets (CIFAR-10, CIFAR-100, and ImageNet) and network architectures, we find that, at an appropriate sparsity, winning tickets perform comparably to or even better than their dense counterparts in all four aspects.

Moreover, we propose a data and model co-design framework that finds lottery tickets more efficiently by using only a specially selected subset of the data, called the Pruning-Aware Critical set (PrAC set), rather than the full training set. The concept of the PrAC set was inspired by recent observations that deep networks have samples that are either hard to memorize during training or easy to forget during pruning. A PrAC set is thus hypothesized to capture the most challenging and informative examples for the dense model. We observe that a high-quality winning ticket can be found by training and pruning the dense network on the very compact PrAC set, which substantially reduces the training iterations needed for the ticket-finding process. Extensive experiments validate our proposal across diverse datasets and network architectures. Our study offers in-depth insights into the LTH, promotes its practical benefits by finding winning tickets more efficiently, and serves as an important reference for researchers and engineers who seek to incorporate winning tickets into user-facing deployments.
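For readers unfamiliar with the train-prune-retrain process mentioned in the abstract, the following is a minimal sketch of how a ticket is commonly searched for. It assumes iterative magnitude pruning with a reset to the original initialization, which is the usual procedure in the LTH literature; the abstract does not state that the thesis uses exactly this variant. The function names, the 20% per-round pruning fraction, and the toy quadratic "training" routine are all hypothetical stand-ins chosen for illustration, and a flat list of numbers stands in for a real network.

```python
import random


def magnitude_mask(weights, mask, prune_fraction):
    """Return a new mask that drops the smallest-magnitude surviving weights."""
    surviving = [i for i, m in enumerate(mask) if m == 1]
    n_prune = int(len(surviving) * prune_fraction)
    # Sort surviving indices by absolute trained weight, smallest first.
    surviving.sort(key=lambda i: abs(weights[i]))
    new_mask = list(mask)
    for i in surviving[:n_prune]:
        new_mask[i] = 0
    return new_mask


def toy_train(init_weights, mask, steps=50, lr=0.1):
    """Hypothetical stand-in for training: gradient descent on a toy
    quadratic loss sum((w - target)^2), with pruned weights held at zero."""
    target = [((i % 5) - 2) * 0.5 for i in range(len(init_weights))]
    w = [wi * mi for wi, mi in zip(init_weights, mask)]
    for _ in range(steps):
        w = [(wi - lr * 2.0 * (wi - ti)) * mi
             for wi, ti, mi in zip(w, target, mask)]
    return w


def find_ticket(init_weights, rounds=3, prune_fraction=0.2):
    """Run several train-prune rounds, restarting from the same init each time."""
    mask = [1] * len(init_weights)
    for _ in range(rounds):
        trained = toy_train(init_weights, mask)                 # train
        mask = magnitude_mask(trained, mask, prune_fraction)    # prune
    # The candidate ticket is the original initialization restricted to the mask.
    ticket = [w * m for w, m in zip(init_weights, mask)]
    return ticket, mask


random.seed(0)
init = [random.gauss(0.0, 1.0) for _ in range(100)]
ticket, mask = find_ticket(init)
sparsity = 1 - sum(mask) / len(mask)
```

Each round retrains from the same initialization, which is why the search is expensive on a real network: the cost is roughly one full training run per round. In this sketch the retraining step would be the place where a smaller training subset, such as the PrAC set described above, could reduce the number of iterations.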
Keywords/Search Tags: Lottery Ticket Hypothesis, Network pruning, Image classification, Model compression