
Research On Sparse Algorithms And Non-convex Optimization Problems In Machine Learning

Posted on: 2020-05-20    Degree: Doctor    Type: Dissertation
Country: China    Candidate: L Chen    Full Text: PDF
GTID: 1368330602950280    Subject: Operational Research and Cybernetics

Abstract/Summary:
With the development of data acquisition and storage technology, vast amounts of data are generated every day in finance, medicine, networking, and other fields. Designing fast and effective algorithms that mine valuable information from such data has become an urgent problem in big data, and sparse learning is an important way to address it. For large-scale datasets, existing kernel-based models use all of the samples to compute the kernel matrix, which consumes a great deal of memory and time, makes the solution non-sparse, and leaves the algorithms ill-suited to big data. For high-dimensional datasets, the features contain redundancy. Existing random projection methods for dimension reduction are fast and effective, but because they are completely random, the non-zero entries of such projection matrices are distributed non-uniformly among the rows, which causes more information to be lost after dimension reduction. In addition, many optimization models in machine learning are non-convex, and designing efficient algorithms that quickly find a “global” optimal solution of these models is another topic worth studying. This dissertation focuses on sparse learning algorithms for big data and on non-convex optimization problems in machine learning. Its main contributions are the following.

(1) To address the facts that the solution of the robust least squares support vector machine (R-LSSVM) loses sparsity and that the model is difficult to apply to large-scale datasets, a sparse R-LSSVM algorithm (SR-LSSVM) is proposed. First, we interpret the robustness of the R-LSSVM from a re-weighted viewpoint and then derive a primal R-LSSVM using the representer theorem; the new model may admit a sparse solution. We then design a convergent SR-LSSVM algorithm that obtains a sparse solution of the primal R-LSSVM after computing a low-rank approximation of the kernel matrix. The new algorithm overcomes the problems that solutions to the R-LSSVM produced by existing algorithms are not sparse and that those algorithms struggle with big-data problems. Moreover, its computational complexity is lower than that of the existing algorithm, so it can train efficiently on large-scale datasets. Extensive experiments demonstrate that SR-LSSVM achieves better or comparable performance to related algorithms in less training time, especially on large-scale problems.

(2) Because it requires computing the full kernel matrix, the kernel c-means clustering algorithm is slow, memory-hungry, and difficult to apply to large-scale datasets. To solve this problem, a fast kernel c-means clustering algorithm based on incomplete Cholesky factorization is proposed. The algorithm uses incomplete Cholesky factorization to approximate the full kernel matrix by the product of a low-rank matrix and its transpose, and then runs linear c-means clustering using the columns of the transpose of that low-rank matrix as the input data points. We prove that as the eigenvalues of the kernel matrix decay, the difference between the clustering results of the new algorithm and those of kernel c-means with the full kernel matrix decreases exponentially. Experiments show that the proposed algorithm performs similarly to the standard kernel c-means clustering algorithm while reducing memory usage and running time, making it suitable for large-scale datasets.
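To make the low-rank route in (2) concrete, here is a minimal Python sketch, assuming an RBF kernel (whose diagonal is identically one); the pivoted incomplete Cholesky variant and the function names below are illustrative assumptions, not the dissertation's exact implementation:

```python
import numpy as np
from sklearn.cluster import KMeans

def rbf_column(X, i, gamma=0.5):
    # i-th column of the RBF kernel matrix, computed on demand
    return np.exp(-gamma * np.sum((X - X[i]) ** 2, axis=1))

def incomplete_cholesky(X, rank, tol=1e-6, gamma=0.5):
    """Pivoted incomplete Cholesky: K ~= G @ G.T with G of shape
    (n, rank); the full n-by-n kernel matrix is never formed."""
    n = X.shape[0]
    G = np.zeros((n, rank))
    d = np.ones(n)                        # residual diagonal (RBF: diag(K) = 1)
    for j in range(rank):
        i = int(np.argmax(d))             # pivot: largest residual diagonal
        if d[i] < tol:                    # kernel matrix is numerically rank j
            return G[:, :j]
        col = rbf_column(X, i, gamma)
        G[:, j] = (col - G[:, :j] @ G[i, :j]) / np.sqrt(d[i])
        d = np.maximum(d - G[:, j] ** 2, 0.0)
    return G

def fast_kernel_cmeans(X, c, rank, gamma=0.5):
    # rows of G approximate the feature-space images of the samples,
    # so an ordinary (linear) c-means suffices
    G = incomplete_cholesky(X, rank, gamma=gamma)
    return KMeans(n_clusters=c, n_init=10).fit_predict(G)
```

Because clustering runs on the n-by-m factor G rather than on the n-by-n kernel matrix, memory and time scale with the chosen rank m instead of with n².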
(3) To solve the kernel fuzzy c-means (KFCM) model quickly, three fast KFCM algorithms based on DC (difference-of-convex) programming and the DCA algorithm are proposed. First, we prove that the KFCM model can be transformed into a DC decomposition under certain conditions and propose two DCA-based algorithms. Then, to improve the efficiency of the kernel-matrix computation in the second algorithm, we adopt an approximation strategy: it avoids computing the entire kernel matrix, and each cluster center is a linear combination of randomly selected samples rather than of all samples. Finally, we alternate between KFCM and the proposed DCA-based KFCM algorithms several times to find a good initial point. Experimental results show that the new algorithms are superior to the traditional KFCM algorithm in clustering accuracy, running time, and number of iterations.

(4) To solve a class of non-convex optimization problems with multiple local optima in machine learning, SVRG-GOA is proposed, which combines the graduated optimization algorithm (GOA) with stochastic variance-reduced gradients (SVRG). The method designs a new smoothed version of the objective, closer to the original function than the existing one, that turns the original non-convex function into a sequence of locally strongly convex functions. SVRG is then used to solve these smoothed, locally strongly convex functions iteratively, yielding the “global” optimal solution of this class of non-convex problems. We prove that the variance of the update rule in SVRG-GOA is bounded, that the new algorithm converges, and that its iteration complexity is lower than that of the existing algorithm. Next, the PSVRG-GOA algorithm is proposed, based on proximal SVRG (PSVRG), to avoid computing the gradient of the convex part; we prove that PSVRG-GOA converges and has the same iteration complexity as SVRG-GOA. In addition, to keep the algorithm from being trapped in a small region too early to find the global optimum, we use a larger shrinkage factor; to speed up convergence further, we use a relatively large fixed step size; and to reduce variance further, we adopt a mini-batch technique. Finally, the algorithm is extended to minimizing sums of non-convex functions. Experimental results show that the new algorithms converge to the “global” optimal solution of non-convex problems faster than existing algorithms.

(5) A stable sparse subspace embedding algorithm (S-SSE) is proposed for feature extraction on high-dimensional datasets, based on the statistical idea of sampling without replacement; the non-zero entries of the new sparse matrix are uniformly distributed among its rows. We prove that the S-SSE matrix is more stable than the existing matrix and that the new method attains high approximation accuracy for Euclidean distances. It overcomes two problems of existing sparse embedding matrices: because the row labels of the non-zero entries are chosen completely at random, the non-zero entries are distributed non-uniformly among the rows, and the matrices vary greatly, i.e., have large variance. Experimental results confirm our theoretical analysis and show the superiority of the new method over existing algorithms.
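To illustrate the balanced construction in (5), the sketch below builds a CountSketch-style embedding with one ±1 per column but assigns the row labels by shuffling a balanced pool, i.e., by sampling without replacement; the function name `s_sse_matrix` and this exact construction are illustrative assumptions, not the dissertation's precise S-SSE definition:

```python
import numpy as np
from scipy import sparse

def s_sse_matrix(n_features, k, seed=None):
    """Sparse embedding S of shape (k, n_features) with one +/-1 per
    column. Row labels come from a shuffled balanced pool (sampling
    without replacement), so each row holds ~n_features/k non-zeros."""
    rng = np.random.default_rng(seed)
    reps = -(-n_features // k)                     # ceil(n_features / k)
    pool = np.tile(np.arange(k), reps)[:n_features]
    rows = rng.permutation(pool)                   # balanced row assignment
    signs = rng.choice([-1.0, 1.0], size=n_features)
    cols = np.arange(n_features)
    return sparse.csr_matrix((signs, (rows, cols)),
                             shape=(k, n_features))

# usage: embed 500 samples from 10000 features down to 256
X = np.random.randn(500, 10000)
S = s_sse_matrix(10000, 256, seed=0)
X_low = (S @ X.T).T    # shape (500, 256); squared norms preserved in expectation
```

Compared with drawing each row label independently at random, the shuffled balanced pool fixes the number of non-zeros per row, removing the variability in row occupancy that the abstract identifies as the source of instability.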
Keywords/Search Tags:sample sparsity, Cholesky factorization, feature extraction, non-convex optimization, support vector machine, clustering, stochastic gradient