
An Optimization Method For Solving Large-scale Machine Learning Problems

Posted on: 2019-07-08
Degree: Master
Type: Thesis
Country: China
Candidate: Q Zhou
GTID: 2438330566973215
Subject: Mathematics
Abstract/Summary:
With the rapid development of science and technology, data acquisition has become increasingly varied and convenient, and large amounts of high-dimensional data are produced every day in all walks of life. However, these data often contain a great deal of extraneous information. How to process and classify such data effectively, remove the redundancy, and extract the useful information from it has therefore become one of the most basic and important problems in machine learning. Many machine learning problems are solved by solving large-scale optimization problems. Against this background, this thesis studies machine learning algorithms of the form "regularization term + loss function". We divide such problems into two categories, the smooth empirical loss problem and the structural risk minimization problem, and we mainly use stochastic optimization methods to solve them. The detailed arrangement is as follows.

For the smooth empirical loss problem, we propose algorithms built on existing stochastic gradient methods. Because of the noise in the stochastic gradient estimate, the stochastic gradient method cannot converge to the optimal solution with a fixed step size, and even with a decreasing step size it only attains a slow, sublinear rate of convergence. We explore two methods. First, we propose a noise reduction method for the stochastic gradient method, the stochastically controlled stochastic gradient (SCSG) method, which adds an automatic parameter-adjustment scheme on top of the SVRG noise reduction method. It converges at the same rate as SVRG, but its computational complexity is much lower and, compared with SVRG, the required storage space is also greatly reduced. Second, we explore a variant of the stochastic L-BFGS method, which builds on the stochastic Newton method and uses approximate second-order information to handle nonlinear and ill-conditioned objectives. The approximate Hessian matrix is modified following the idea of quasi-Newton methods, which yields a stochastic quasi-Newton method. The modified stochastic L-BFGS method combines the limited-memory idea from large-scale optimization with the noise reduction techniques used for stochastic gradients. Theoretical analysis shows that the algorithm achieves a linear convergence rate.

For the structural risk minimization problem, we treat the regularization term as an independent entity. When the loss function is smooth but the regularization term is non-smooth, we propose a proximal stochastic L-BFGS algorithm. The problem is split into a smooth part and a non-smooth part: a stochastic L-BFGS method handles the smooth loss term, a proximal operator is introduced for the non-smooth term, and the stochastic iteration and the proximal iteration are carried out alternately. A theoretical analysis shows that the algorithm attains a linear convergence rate. When the loss function is non-smooth, we propose a stochastic subgradient mirror descent algorithm with weighted averaging. The main innovation of this method is the choice of the averaging weights, which are generated by the iterates of the algorithm rather than set by hand, so the method is adaptive. We also analyze the convergence of the algorithm theoretically.
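As an illustration of the kind of variance-reduced update the SCSG method builds on, the following sketch shows an SVRG-style outer/inner loop whose anchor gradient is estimated on a subsample rather than the full data set. The batch sizes `B` and `b`, the step size `eta`, and the geometric inner-loop length are illustrative assumptions, not the parameter choices analyzed in the thesis.

```python
import numpy as np

def scsg(grad_i, n, x0, eta=0.1, B=64, b=1, outer_iters=50, rng=None):
    """Sketch of a stochastically controlled stochastic gradient (SCSG)-style loop.

    grad_i(x, i) returns the gradient of the i-th component function at x;
    n is the number of component functions. The constants eta, B, b are
    placeholders for illustration only.
    """
    rng = rng or np.random.default_rng(0)
    x = x0.copy()
    for _ in range(outer_iters):
        # Anchor gradient on a subsample of size B instead of the full data,
        # which keeps the per-epoch cost below that of plain SVRG.
        batch = rng.choice(n, size=B, replace=False)
        x_tilde = x.copy()
        g_tilde = np.mean([grad_i(x_tilde, j) for j in batch], axis=0)
        # Inner-loop length drawn geometrically, as in SCSG-type schemes.
        m = rng.geometric(p=b / (B + b))
        for _ in range(m):
            idx = rng.choice(batch, size=b)
            v = (np.mean([grad_i(x, j) for j in idx], axis=0)
                 - np.mean([grad_i(x_tilde, j) for j in idx], axis=0)
                 + g_tilde)          # variance-reduced search direction
            x -= eta * v
    return x
```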
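To make the stochastic L-BFGS idea concrete, the sketch below preconditions a stochastic gradient with the standard limited-memory two-loop recursion. The memory size `m`, the step size `eta`, and the way the curvature pairs are collected are simplifying assumptions; the thesis's variant additionally combines this with the noise reduction scheme sketched above.

```python
import numpy as np

def two_loop_direction(g, history):
    """Standard L-BFGS two-loop recursion: returns an approximation of H @ g,
    where H is the inverse-Hessian approximation built from the stored
    curvature pairs (s_k, y_k) in `history` (most recent last)."""
    q = g.copy()
    alphas = []
    for s, y in reversed(history):
        rho = 1.0 / np.dot(y, s)
        a = rho * np.dot(s, q)
        alphas.append((a, rho, s, y))
        q -= a * y
    if history:
        s, y = history[-1]
        q *= np.dot(s, y) / np.dot(y, y)   # initial scaling H0 = gamma * I
    for a, rho, s, y in reversed(alphas):
        beta = rho * np.dot(y, q)
        q += (a - beta) * s
    return q

def stochastic_lbfgs_step(x, stochastic_grad, history, eta=0.05, m=10):
    """One stochastic L-BFGS-type step: precondition a stochastic gradient with
    the two-loop recursion, then refresh the curvature memory (sketch only)."""
    g = stochastic_grad(x)
    x_new = x - eta * two_loop_direction(g, history)
    s, y = x_new - x, stochastic_grad(x_new) - g
    if np.dot(s, y) > 1e-10:               # keep only safely curved pairs
        history.append((s, y))
        if len(history) > m:
            history.pop(0)
    return x_new
```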
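For the proximal stochastic scheme, the alternation between a stochastic step on the smooth loss and a proximal step on the non-smooth regularizer can be sketched as follows, using the L1 norm as an example regularizer (its proximal operator is soft-thresholding). The regularizer and the constants `eta`, `lam` are illustrative assumptions, not the thesis's setting.

```python
import numpy as np

def prox_l1(v, t):
    """Proximal operator of t * ||.||_1 (soft-thresholding), used here as an
    example of a non-smooth regularization term."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def proximal_stochastic_step(x, stochastic_grad, eta=0.1, lam=0.01):
    """One alternation: a stochastic (gradient or quasi-Newton) step on the
    smooth loss term, followed by a proximal step on the non-smooth term."""
    x_half = x - eta * stochastic_grad(x)   # smooth part
    return prox_l1(x_half, eta * lam)       # non-smooth part
```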
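For the non-smooth loss case, the following sketch shows stochastic subgradient mirror descent with a weighted average of the iterates, written in the Euclidean setting where the mirror step reduces to a projected subgradient step. The subgradient oracle `subgrad` and the projection `project` are placeholders, and the weights `w_k` used here (proportional to the step sizes) are a generic illustrative choice, not the adaptively generated weights proposed in the thesis.

```python
import numpy as np

def mirror_descent_weighted_average(subgrad, x0, project, steps=1000):
    """Stochastic subgradient mirror descent returning a weighted average of
    the iterates. With the Euclidean mirror map the update is a projected
    subgradient step; `project` maps a point back onto the feasible set."""
    x = x0.copy()
    avg, total_w = np.zeros_like(x0, dtype=float), 0.0
    for k in range(1, steps + 1):
        eta_k = 1.0 / np.sqrt(k)            # decreasing step size
        x = project(x - eta_k * subgrad(x))
        w_k = eta_k                         # generic weight choice (illustrative)
        total_w += w_k
        avg += (w_k / total_w) * (x - avg)  # running weighted average
    return avg
```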
Keywords/Search Tags: Stochastic Optimization, L-BFGS method, Structural Risk Minimization, Proximal Operator, Weighted Average, Mirror Descent