
Efficient Adaptive Subgradient Methods Based On Random Projection

Posted on: 2020-10-24
Degree: Master
Type: Thesis
Country: China
Candidate: Y Y Wan
Full Text: PDF
GTID: 2428330575952500
Subject: Computer Science and Technology
Abstract/Summary:
Online learning is a general framework that enjoys an attractive combination of computational efficiency and theoretical guarantees: it can quickly update parameters using data collected in real time. With the advent of the era of big data and the need to deal with large-scale, fast-growing data, it has received ever-increasing attention. Adaptive subgradient methods, which dynamically integrate knowledge of the geometry of the observed data to guide the update direction, are popular for online learning. According to the amount of information used, adaptive subgradient methods can be divided into a diagonal-matrix version and a full-matrix version. Because the full-matrix version is computationally intractable in high dimensions, the diagonal-matrix version is the one most commonly adopted in practice. However, the diagonal-matrix version only maintains the diagonal elements of the gradient outer products, and therefore cannot capture correlations between gradient coordinates. When high-dimensional data is dense and has an approximately low-rank structure, the diagonal version performs worse than the full-matrix version. This thesis mainly studies how to reduce the complexity of adaptive subgradient methods without sacrificing their performance. The main contributions are as follows:

1. To reduce the unacceptable complexity of the full-matrix version of adaptive subgradient methods, we propose an efficient adaptive subgradient method based on gradient projection. The main idea is to employ random projection to generate a low-rank matrix that approximates the outer product matrix of the gradients. In subsequent calculations, the square root and inverse of the gradient outer-product matrix can be computed much faster by maintaining and manipulating only the low-rank matrix, which makes the proposed method more efficient than the full-matrix version. Experimental results show that the proposed method achieves performance close to the full-matrix version while significantly reducing the
running time. However, we cannot establish a regret bound for this method, because dependence issues arise during its parameter updates.

2. To circumvent the problem that the efficient adaptive subgradient method based on gradient projection cannot achieve a regret bound, we further propose an efficient adaptive subgradient method based on data projection. Specifically, for the generalized linear models commonly used in machine learning, the gradient is a scalar multiple of the data point (for a loss ℓ(w^T x_t), the gradient is g_t = ℓ'(w_t^T x_t) x_t), so we first propose to replace the outer product matrix of gradients with the outer product matrix of the data, which simply and elegantly avoids the dependence issues of the previous method. The proposed method then employs random projection to generate a low-rank matrix that approximates the data outer-product matrix, accelerating the parameter updates and making it as efficient as the previous method based on gradient projection. Furthermore, we establish regret bounds for the proposed method through theoretical analysis. The theoretical results reveal that, for generalized linear models, the proposed method achieves performance close to the full-matrix version as long as the data is low-rank or approximately low-rank. Experimental results show that the proposed method performs similarly to the previous method based on gradient projection while running much faster than the full-matrix version.
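The abstract does not spell out the exact sketching construction, so the following is only a minimal illustrative sketch of the general idea: maintain a d x m random-projection sketch B whose product B B^T / m approximates the gradient outer-product sum, then apply the inverse-square-root preconditioner through a thin SVD of B at O(dm^2) cost rather than the O(d^3) cost of factoring the full d x d matrix. The function name `sketched_adagrad_step` and the rank-one Gaussian sketch update are hypothetical choices, not the thesis's algorithm.

```python
import numpy as np

def sketched_adagrad_step(w, g, B, lr=0.1, delta=1e-3):
    """One step of a hypothetical random-projection adaptive subgradient method.

    B is a d x m sketch with B @ B.T / m approximating the gradient
    outer-product sum G_t = sum_s g_s g_s^T; the preconditioner
    H = (B B^T / m + delta I)^{1/2} is inverted via a thin SVD of B.
    """
    d, m = B.shape
    # Rank-one sketch update: B <- B + g w_rand^T with Gaussian w_rand.
    # Since E[||w_rand||^2] = m, E[B B^T] grows by m * g g^T per step,
    # which motivates the 1/m scaling below.
    B = B + np.outer(g, np.random.randn(m))
    # Thin SVD: B = U diag(s) V^T, with U of shape d x m (assuming d >= m).
    U, s, _ = np.linalg.svd(B, full_matrices=False)
    lam = np.sqrt(s**2 / m + delta)  # eigenvalues of H on range(U)
    tail = np.sqrt(delta)            # eigenvalue of H off range(U)
    Ug = U.T @ g
    # H^{-1} g, split into the range of U and its orthogonal complement.
    precond_g = U @ (Ug / lam) + (g - U @ Ug) / tail
    return w - lr * precond_g, B
```

Per step this touches only a d x m matrix, so memory is O(dm) and time is dominated by the SVD at O(dm^2), versus the O(d^2) memory and O(d^3) factorization cost that make the exact full-matrix version intractable in high dimensions.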
Keywords/Search Tags: online learning, adaptive subgradient method, random projection, gradient projection, data projection