Parallel Stochastic Gradient Descent Algorithm On Large-scale High-dimensional And Sparse Data

Posted on: 2021-04-08    Degree: Master    Type: Thesis
Country: China    Candidate: W Qin    Full Text: PDF
GTID: 2428330611487195    Subject: Computer application technology
Abstract/Summary:
With the rapid development of computer networks and the explosive growth of information in modern society, the advent of big data has facilitated the development of recommender systems, which have improved the quality of people's daily lives. Recommender systems often use high-dimensional and sparse (HiDS) matrices to quantify the relationships between users and items in an incomplete matrix. To extract useful information from HiDS matrices, researchers have proposed various big-data analysis methods, among which latent factor analysis has been shown to obtain and represent the information in such matrices efficiently. Recommender systems based on latent factor analysis commonly adopt stochastic gradient descent (SGD) as the learning algorithm; however, SGD is inherently sequential, so it incurs considerable time overhead and scales poorly on large-scale industrial problems. To address these problems, this thesis proposes several novel parallelization strategies that improve the convergence rate and computational efficiency of the model. The main contributions are as follows; illustrative sketches of the algorithms involved are given after the abstract.

(1) The application of latent factor analysis in recommender systems is reviewed, the difficulties of parallelizing SGD are analyzed theoretically, and existing SGD-based parallel latent factor models are studied and compared.

(2) A momentum-incorporated parallel SGD algorithm is proposed. The algorithm adds a momentum term to the stochastic gradient update and parallelizes training with a novel data-partitioning strategy. Experiments on large-scale industrial datasets show that the algorithm improves the convergence speed and computational efficiency of the model.

(3) A hierarchical SGD-based parallel algorithm is proposed. The algorithm parallelizes training at two levels, and experiments on large-scale, sparse, real-world datasets show that the hierarchical parallel latent factor model achieves higher speedup when solving large-scale matrix factorization.
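The abstract does not spell out the model, but latent factor analysis on a HiDS matrix is conventionally trained with the regularized SGD update sketched below. This is a minimal sketch of that standard formulation; the function name, hyperparameters, and initialization are illustrative assumptions, not details taken from the thesis.

```python
# Minimal sketch: latent factor analysis on a HiDS rating matrix via SGD.
# Standard regularized matrix-factorization updates; all names and
# hyperparameter values here are illustrative assumptions.
import numpy as np

def sgd_lfa(ratings, n_users, n_items, k=20, lr=0.01, lam=0.05, epochs=20, seed=0):
    """ratings: iterable of (user, item, value) triples for observed entries only."""
    rng = np.random.default_rng(seed)
    P = 0.1 * rng.standard_normal((n_users, k))   # user latent factors
    Q = 0.1 * rng.standard_normal((n_items, k))   # item latent factors
    for _ in range(epochs):
        for u, i, r in ratings:
            e = r - P[u] @ Q[i]                   # error on one observed entry
            pu = P[u].copy()                      # keep old value for Q's update
            P[u] += lr * (e * Q[i] - lam * pu)
            Q[i] += lr * (e * pu - lam * Q[i])
    return P, Q
```

Because each step touches only the factors of one user and one item, naively running steps in parallel lets workers overwrite each other's updates; this dependency is the scalability problem the thesis targets.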
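For contribution (2), the abstract names a momentum term and a novel data-partitioning strategy without defining either. The sketch below combines standard momentum buffers with a DSGD-style block schedule, in which the blocks of one stratum share no users or items and can therefore be updated by different workers without locking; this particular partition is an assumption for illustration, not necessarily the thesis's strategy.

```python
# Hypothetical sketch: momentum SGD over one block of the rating matrix,
# plus a block schedule whose strata are conflict-free (no shared users
# or items within a stratum).
import numpy as np

def momentum_block_step(P, Q, VP, VQ, block, lr=0.01, lam=0.05, beta=0.9):
    """One momentum-SGD pass over the observed entries of a single block."""
    for u, i, r in block:
        e = r - P[u] @ Q[i]
        VP[u] = beta * VP[u] + lr * (e * Q[i] - lam * P[u])  # velocity buffers
        VQ[i] = beta * VQ[i] + lr * (e * P[u] - lam * Q[i])
        P[u] += VP[u]
        Q[i] += VQ[i]

def strata(n_blocks):
    """Diagonal schedule: within each yielded stratum, the (row, column)
    block pairs are pairwise disjoint, so they can run on separate workers."""
    for shift in range(n_blocks):
        yield [(b, (b + shift) % n_blocks) for b in range(n_blocks)]
```

An epoch then loops over the strata in order, dispatching each stratum's blocks to a worker pool and synchronizing before moving to the next stratum.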
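Contribution (3) states only that training is parallelized at two levels. As a hedged illustration, the sketch below assumes an outer level that runs conflict-free blocks on a thread pool and an inner level that vectorizes each block's update over its entries with NumPy (which releases the GIL during array work); the thesis's actual two levels may differ.

```python
# Hypothetical two-level sketch: threads over conflict-free blocks (outer),
# vectorized updates over each block's entries (inner).
from concurrent.futures import ThreadPoolExecutor
import numpy as np

def batch_step(P, Q, users, items, vals, lr=0.01, lam=0.05):
    """Inner level: one vectorized SGD step over all entries of a block."""
    e = vals - np.einsum('ij,ij->i', P[users], Q[items])  # per-entry errors
    gP = e[:, None] * Q[items] - lam * P[users]
    gQ = e[:, None] * P[users] - lam * Q[items]
    np.add.at(P, users, lr * gP)   # scatter-add handles repeated indices
    np.add.at(Q, items, lr * gQ)

def epoch(P, Q, strata, workers=4):
    """Outer level: blocks within a stratum touch disjoint rows and columns,
    so a thread pool can process them concurrently."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for stratum in strata:   # each block: (users, items, vals) index arrays
            list(pool.map(lambda blk: batch_step(P, Q, *blk), stratum))
```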
Keywords/Search Tags: Big Data, Latent Factor Analysis, High-Dimensional and Sparse (HiDS) Matrix, Stochastic Gradient Descent, Parallel Computing