
Research On Gradient Algorithm Based On Variance Reduction Technique

Posted on: 2024-07-18
Degree: Master
Type: Thesis
Country: China
Candidate: Y J Sun
Full Text: PDF
GTID: 2568307124992849
Subject: Statistics
Abstract/Summary:
Designing more efficient algorithms for training deep models is a recurring problem of broad interdisciplinary interest. The SSAG algorithm is an effective method for training deep models proposed in recent years. Its core idea is to replace traditional simple random sampling with stratified sampling according to label signals, which controls the gradient variance more effectively and thereby improves the algorithm's performance. This thesis extends this core strategy to other widely used deep learning algorithms. It proposes two new algorithms, the stratified adaptive gradient algorithm (SAdam) and the stratified stochastic variance reduced gradient algorithm (SSVRG), and proves theoretically that both algorithms converge linearly. Experimental results confirm the expected performance of the algorithms. The main work and results of this thesis are as follows:

First, after reviewing gradient algorithms based on the stratified sampling variance reduction technique, the SAdam algorithm is proposed. It is proved that, under smoothness and strong convexity assumptions, the algorithm attains a linear convergence rate, which is faster than the sublinear rates of most mainstream algorithms. The thesis compares the performance of SAdam against the SSAG, Adam, and AdaGrad algorithms on the MNIST, unbalanced MNIST, breast cancer, and CIFAR-10 datasets. The results show that SAdam has clear advantages over these algorithms, especially on the two types of datasets with unbalanced class labels and small differences between classes.

Second, after reviewing existing variance reduction gradient algorithms, the SSVRG algorithm is proposed, and it is likewise proved to have a linear convergence rate under smoothness and strong convexity assumptions. The thesis compares the performance of SSVRG against the SGD, SVRG, and BSUG algorithms on the MNIST and Fashion-MNIST datasets. The results show that SSVRG outperforms these algorithms and is well suited to processing large-scale data.
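To make the core sampling idea concrete, below is a minimal sketch of drawing a mini-batch by stratified sampling over class labels. It assumes NumPy; the helper name stratified_batch and the proportional allocation rule (with at least one sample per class) are illustrative assumptions, not the thesis's exact scheme.

```python
import numpy as np

def stratified_batch(labels, batch_size, rng=np.random.default_rng()):
    """Draw a mini-batch by stratified sampling over class labels,
    allocating batch slots to each class in proportion to its frequency."""
    classes, counts = np.unique(labels, return_counts=True)
    # Proportional allocation, rounded, with a floor of one sample per
    # class so rare classes in unbalanced data are never dropped.
    alloc = np.maximum(1, np.round(batch_size * counts / counts.sum())).astype(int)
    idx = []
    for c, k in zip(classes, alloc):
        pool = np.flatnonzero(labels == c)
        idx.append(rng.choice(pool, size=min(k, pool.size), replace=False))
    return np.concatenate(idx)
```

Replacing simple random sampling with such a draw keeps the per-class composition of every batch stable, which removes the between-class component of the sampling noise from the stochastic gradient; the update rule of the underlying algorithm is unchanged.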
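For the variance-reduced variant, the following sketch shows one outer round of a standard SVRG-style update driven by stratified mini-batches, in the spirit of the SSVRG idea. It reuses stratified_batch from the sketch above; grad(w, idx) is a placeholder for the mean gradient of the loss over the indexed training samples, and the step size and loop lengths are arbitrary.

```python
import numpy as np

def ssvrg_epoch(w, labels, grad, lr=0.1, batch_size=64, inner_steps=100,
                rng=np.random.default_rng()):
    """One outer round of an SVRG-style update with stratified batches.
    `grad(w, idx)` returns the mean gradient over the indexed samples."""
    n = labels.shape[0]
    snapshot = w.copy()
    full_grad = grad(snapshot, np.arange(n))  # full gradient at the snapshot
    for _ in range(inner_steps):
        idx = stratified_batch(labels, batch_size, rng)
        # Control-variate correction: recenter the batch gradient with the
        # snapshot's batch gradient plus the stored full gradient, keeping
        # the update unbiased while its variance shrinks near the snapshot.
        g = grad(w, idx) - grad(snapshot, idx) + full_grad
        w = w - lr * g
    return w
```

The control-variate term grad(snapshot, idx) - full_grad has zero mean under the sampling distribution, so combining it with stratified batches attacks the gradient variance from both directions, which is consistent with the linear convergence the thesis claims for SSVRG.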
Keywords/Search Tags: Gradient descent, Unbalanced data, Variance reduction, Stratified sampling, Control variates