
A Study of First-Order Stochastic Optimization for Finite-Sum Functions

Posted on: 2020-05-25
Degree: Master
Type: Thesis
Country: China
Candidate: B M Tao
GTID: 2370330596485360
Subject: Mathematics
Abstract/Summary:
Structural risk minimization, represented by the support vector machine (SVM), is an important problem in machine learning. Problems of this kind share a common structure: the objective function is a finite-sum function plus a regularization term. First-order algorithms based on gradient descent are commonly used to solve them, and finding an efficient algorithm when the finite sum is large is an important research issue.

As an improvement of stochastic gradient descent, the adaptive stochastic gradient descent algorithm selects samples during iteration according to a specific distribution p, and an effective stopping criterion is given. The new stopping criterion can stop the iteration at an early stage when the training set of the SVM model is large. When the data points are imbalanced, the distribution p can be adjusted so that the algorithm better adapts to the imbalanced data.

For the nonconvex and nonsmooth case, the stochastic gradient descent algorithm with an accelerated regularization term based on variance reduction proceeds in two steps. The first step handles the nonconvex finite-sum part: the algorithm exploits the nonconvexity parameter of the objective function so that, in each iteration, the objective is transformed from a nonconvex function into a strongly convex one and then solved. The algorithm controls the difference between the original objective and the new objective with the accelerated regularization term, so that an approximate local minimizer of the regularized problem is still an approximate local minimizer of the original problem. The second step uses the proximal gradient operator to handle the nonsmooth part. Based on an upper bound on the proposed algorithm's gradient estimator, we give the number of iterations needed to obtain a solution satisfying E[‖F_η(x)‖²] ≤ ε, which leads to a computational complexity of O(n + n/ε). Numerical experiments validate the effectiveness of the proposed algorithm.
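To illustrate the first algorithm, the sketch below shows stochastic gradient descent with a non-uniform sampling distribution p and an early-stopping rule, applied to an L2-regularized hinge-loss (SVM) objective. The hinge loss, the step size, the parameter-movement stopping test, and the function name adaptive_sgd are illustrative assumptions, not the thesis's exact choices; the importance weight 1/(n·p_i) keeps the gradient estimator unbiased under sampling from p.

```python
import numpy as np

def adaptive_sgd(X, y, p, lam=0.01, lr=0.1, max_epochs=50, tol=1e-4):
    """Minimal sketch: SGD with sampling distribution p for
    min (1/n) sum_i hinge(y_i, x_i, w) + (lam/2)||w||^2.
    p, lr, tol are illustrative assumptions."""
    n, d = X.shape
    w = np.zeros(d)
    for epoch in range(max_epochs):
        w_prev = w.copy()
        for _ in range(n):
            i = np.random.choice(n, p=p)      # sample index i ~ p
            margin = y[i] * X[i].dot(w)
            grad = lam * w                    # gradient of the regularizer
            if margin < 1:                    # subgradient of the hinge loss,
                grad -= y[i] * X[i] / (n * p[i])  # importance-weighted by 1/(n p_i)
            w -= lr * grad
        # assumed stopping criterion: stop early once the iterate stalls
        if np.linalg.norm(w - w_prev) < tol:
            break
    return w
```

For imbalanced data, p can place more mass on minority-class samples, which is one way the distribution can be "adjusted" as the abstract describes.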
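For the nonconvex, nonsmooth case, the following sketch shows the two-step structure described above: a variance-reduced gradient estimator for the finite-sum part plus a strongly convex proximal term σ/2·‖x − x_ref‖², followed by a proximal step on the nonsmooth part. An SVRG-style estimator stands in for the thesis's variance-reduced estimator, the nonsmooth term h is assumed to be an L1 regularizer, and eta, sigma, m, and the names svrg_prox and grad_i are all hypothetical.

```python
import numpy as np

def prox_l1(v, t):
    """Proximal operator of t*||.||_1 (soft-thresholding); h is
    assumed to be an L1 regularizer purely for illustration."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def svrg_prox(grad_i, n, x0, eta=0.01, sigma=1.0, epochs=20, m=None):
    """Minimal sketch of a variance-reduced proximal gradient loop for
    min (1/n) sum_i f_i(x) + h(x), nonconvex f_i, nonsmooth h.
    grad_i(i, x) returns the gradient of f_i at x."""
    m = m or n
    x = x0.copy()
    for _ in range(epochs):
        x_ref = x.copy()                      # snapshot point
        full_grad = np.mean([grad_i(i, x_ref) for i in range(n)], axis=0)
        for _ in range(m):
            i = np.random.randint(n)
            # SVRG-style variance-reduced gradient estimator
            v = grad_i(i, x) - grad_i(i, x_ref) + full_grad
            # gradient of the added strongly convex term (sigma/2)||x - x_ref||^2,
            # which plays the role of the accelerated regularization term
            v += sigma * (x - x_ref)
            x = prox_l1(x - eta * v, eta)     # proximal step on h
    return x
```

The stationarity measure in the complexity bound is the gradient mapping, F_η(x) = (x − prox_{ηh}(x − η∇f(x)))/η, so E[‖F_η(x)‖²] ≤ ε is the standard approximate-stationarity criterion for nonsmooth composite problems.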
Keywords/Search Tags:Machine learning, Support vector machine, Nonconvex, Variance reduction, Momentum acceleration