
Research On Large-Scale Sparse Representation Algorithms Based On Stochastic Optimization

Posted on: 2022-01-28
Degree: Master
Type: Thesis
Country: China
Candidate: B K Wei
Full Text: PDF
GTID: 2518306605489804
Subject: Pattern Recognition and Intelligent Systems
Abstract/Summary:
With the continued development of science and technology and the advancement of Internet technology, the amount of data available for analysis in artificial intelligence, data mining, deep learning, and machine learning keeps growing. The challenge is no longer only how to design algorithms that solve a given problem model, but also how to design efficient, practical algorithms for large-scale, high-dimensional data. This thesis focuses on large-scale optimization problems with sparsity constraints. The mainstream approach to such models is the hard thresholding algorithm, but the high complexity of the hard thresholding operator and of the gradient computations in current algorithms is a serious drawback on large-scale high-dimensional data, which makes the design of efficient algorithms particularly important. At the same time, with the spread of Internet technology and mobile phones, almost everyone generates enormous amounts of data every day, so protecting data privacy while designing algorithms is likewise an important topic; we therefore design a privacy-preserving algorithm for the sparse representation problem. To address these issues, this thesis proposes three novel hard thresholding algorithms that substantially improve efficiency and performance on large-scale or privacy-sensitive data.

1. For support-set algorithms in the sparse representation problem, we propose a gradient support framework based on relaxed constraints, within which many different base algorithms can be embedded for different models. Embedding the well-performing stochastic variance-reduced gradient descent algorithm into this framework yields a stochastic variance-reduced gradient support-set algorithm. A full theoretical analysis shows that the proposed algorithm has strong theoretical advantages, and experimental results show that it achieves good accuracy and efficiency on large-scale data sets.

2. To improve efficiency on large-scale data, this thesis proposes a doubly stochastic variance-reduced hard thresholding algorithm, called SBCD-HTP. The algorithm applies stochastic optimization ideas by randomly sampling examples and using random block coordinates to update only part of the high-dimensional features. To reduce the complexity of the hard thresholding operator, we defer the hard thresholding map, applying the sparse hard thresholding projection only after multiple gradient updates. On top of SBCD-HTP, we use sparse approximation to sparsify the gradient updates so that multi-threading can be exploited for acceleration. Theoretical results show that both the serial and the asynchronous parallel algorithms enjoy linear convergence rates with reduced complexity, and extensive experimental results demonstrate the algorithm's effectiveness and practicality.

3. To protect the privacy of application data, we introduce differential privacy into the group-structured sparsity problem and propose a differentially private group-sparse gradient descent hard thresholding algorithm. The algorithm protects privacy through gradient perturbation, injecting Gaussian noise into the gradient information during each gradient descent step, so that the model solution produced by the algorithm cannot easily be attacked or reverse-engineered from related data. The algorithm's utility upper bound and gradient complexity are improved accordingly. Experimental results show that in the high-dimensional regime our algorithm's utility upper bound has a clear advantage over other algorithms, and under the same privacy budget our algorithm achieves higher accuracy.
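As a rough illustration of the ingredients behind contributions 1 and 2 (not the thesis's exact algorithms), the sketch below combines a hard thresholding operator with a stochastic variance-reduced gradient loop for sparse least squares, applying the thresholding only once per epoch in the spirit of the deferred-thresholding idea. The function names, step size, and epoch counts are illustrative assumptions.

```python
import numpy as np

def hard_threshold(x, k):
    """Keep the k largest-magnitude entries of x, zero out the rest."""
    z = np.zeros_like(x)
    idx = np.argpartition(np.abs(x), -k)[-k:]
    z[idx] = x[idx]
    return z

def svrg_ht(A, b, k, n_epochs=30, inner=None, lr=0.01, rng=None):
    """Sketch: SVRG-style updates for min (1/2n)||A w - b||^2 with ||w||_0 <= k.
    Hard thresholding is deferred to the end of each epoch."""
    rng = np.random.default_rng(rng)
    n, d = A.shape
    inner = inner or n
    w = np.zeros(d)
    for _ in range(n_epochs):
        w_snap = w.copy()
        # Full gradient at the snapshot (computed once per epoch).
        full_grad = A.T @ (A @ w_snap - b) / n
        w_dense = w_snap.copy()
        for _ in range(inner):
            i = rng.integers(n)
            # Variance-reduced stochastic gradient.
            v = (A[i] * (A[i] @ w_dense - b[i])
                 - A[i] * (A[i] @ w_snap - b[i])
                 + full_grad)
            w_dense = w_dense - lr * v      # dense update, thresholding deferred
        w = hard_threshold(w_dense, k)      # one hard thresholding per epoch
    return w
```

Deferring the projection keeps the inner loop cheap: the expensive top-k selection runs once per epoch instead of once per stochastic step.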
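The gradient-perturbation idea in contribution 3 can be sketched as follows: clip per-example gradients to bound sensitivity, add Gaussian noise to the averaged gradient, then hard-threshold. This is a minimal Gaussian-mechanism illustration, not the thesis's group-sparse algorithm; the function name `dp_gd_ht`, the clipping scheme, and all constants are assumptions, and calibrating `sigma` to a concrete (ε, δ) budget is omitted.

```python
import numpy as np

def hard_threshold(x, k):
    """Keep the k largest-magnitude entries of x, zero out the rest."""
    z = np.zeros_like(x)
    idx = np.argpartition(np.abs(x), -k)[-k:]
    z[idx] = x[idx]
    return z

def dp_gd_ht(A, b, k, T=100, lr=0.1, clip=1.0, sigma=1.0, rng=None):
    """Sketch: gradient-perturbed gradient descent with hard thresholding
    for sparse least squares. `clip` bounds each per-example gradient norm
    (sensitivity control); `sigma` scales the Gaussian noise and would be
    derived from the privacy budget (eps, delta) in a real deployment."""
    rng = np.random.default_rng(rng)
    n, d = A.shape
    w = np.zeros(d)
    for _ in range(T):
        residual = A @ w - b
        grads = A * residual[:, None]            # per-example gradients, (n, d)
        norms = np.linalg.norm(grads, axis=1)
        scale = np.minimum(1.0, clip / np.maximum(norms, 1e-12))
        g = (grads * scale[:, None]).sum(axis=0) / n   # clipped average gradient
        # Gaussian mechanism: noise std proportional to sensitivity clip/n.
        noise = rng.normal(0.0, sigma * clip / n, size=d)
        w = hard_threshold(w - lr * (g + noise), k)
    return w
```

The noisy gradient, not the iterate itself, carries the perturbation, which is why an attacker observing the final sparse model cannot easily invert it back to individual training examples.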
Keywords/Search Tags:sparse representation, hard thresholding, stochastic optimization, differential privacy, coordinate descent