Research On Product Recommendation Algorithm Based On Spark Big Data Platform

Posted on: 2022-12-08
Degree: Master
Type: Thesis
Country: China
Candidate: J Cheng
GTID: 2518306614955369
Subject: Computer Software and Application of Computer
Abstract/Summary:
With the growth of the Internet industry, more and more online behavior needs to be recorded, and traditional data storage and processing methods can no longer meet demand. Big data cluster frameworks such as Hadoop and Spark emerged in response. How to analyze users' historical behavior and make accurate recommendations to them has become a key technical focus. This thesis makes two main contributions. First, the traditional HashShuffleManager generates temporary files when the data volume is large, and because it relies on a hash algorithm to partition keys, it suffers from data skew. This thesis proposes a HashShuffleManager scheme based on weight priority that allocates tasks according to computing power, groups the task set according to the number of tasks, and writes the results of tasks in different groups into the same set of temporary files, thereby improving system efficiency. Second, the recommendation algorithm addresses the recommendation problems caused by an overly sparse input matrix. An optimized Slope One algorithm is first used to reduce the sparsity of the input matrix in advance. The cosine similarity formula is then used to evaluate the values. A threshold is obtained from experiments, and the data greater than the threshold are taken to optimize the sparse matrix. Finally, the ALS model is used to predict user preferences, improving the accuracy of the system. To verify the improved scheduling model and the effectiveness of the optimized recommendation model, a series of experiments is carried out in Chapter 5 and the resulting data are compared, so that the design can provide users with more accurate recommendation services.
Keywords/Search Tags: Spark, Shuffle, Data skew, Slope One, Recommendation algorithm
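
The sparsity-reduction step described in the abstract can be illustrated with a small, single-machine sketch. This is not the thesis's optimized algorithm, whose details the abstract does not give. It assumes the standard weighted Slope One predictor, and it assumes that the cosine check compares the candidate item's rating column with the columns of the items the user has already rated, keeping a prediction only when the mean similarity exceeds the threshold. The function names, the sample ratings, and the threshold value 0.5 are all hypothetical.

```python
from collections import defaultdict
from math import sqrt


def slope_one_model(ratings):
    """Average rating deviation dev[(i, j)] and co-rating count freq[(i, j)]."""
    diff_sum = defaultdict(float)
    freq = defaultdict(int)
    for user_ratings in ratings.values():
        for i, r_i in user_ratings.items():
            for j, r_j in user_ratings.items():
                if i != j:
                    diff_sum[(i, j)] += r_i - r_j
                    freq[(i, j)] += 1
    dev = {pair: diff_sum[pair] / freq[pair] for pair in diff_sum}
    return dev, freq


def slope_one_predict(user_ratings, item, dev, freq):
    """Weighted Slope One prediction, or None if no co-rated item exists."""
    num, den = 0.0, 0
    for j, r_j in user_ratings.items():
        if (item, j) in dev:
            num += (dev[(item, j)] + r_j) * freq[(item, j)]
            den += freq[(item, j)]
    return num / den if den else None


def item_cosine(ratings, a, b):
    """Cosine similarity of two item columns; missing ratings count as 0."""
    dot = norm_a = norm_b = 0.0
    for user_ratings in ratings.values():
        r_a = user_ratings.get(a, 0.0)
        r_b = user_ratings.get(b, 0.0)
        dot += r_a * r_b
        norm_a += r_a * r_a
        norm_b += r_b * r_b
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (sqrt(norm_a) * sqrt(norm_b))


def prefill(ratings, threshold):
    """Return a copy of ratings with confident Slope One predictions added.

    Assumption (not from the thesis): a prediction is kept only if the mean
    cosine similarity between the target item and the user's rated items
    is greater than the threshold.
    """
    items = {i for user_ratings in ratings.values() for i in user_ratings}
    dev, freq = slope_one_model(ratings)
    filled = {user: dict(r) for user, r in ratings.items()}
    for user, user_ratings in ratings.items():
        if not user_ratings:
            continue
        for item in items - set(user_ratings):
            sims = [item_cosine(ratings, item, j) for j in user_ratings]
            if sum(sims) / len(sims) <= threshold:
                continue
            pred = slope_one_predict(user_ratings, item, dev, freq)
            if pred is not None:
                filled[user][item] = pred
    return filled


# Hypothetical toy data: user -> {item: rating}
ratings = {
    "alice": {"a": 5.0, "b": 3.0, "c": 2.0},
    "bob": {"a": 3.0, "b": 4.0},
    "carol": {"b": 2.0, "c": 5.0},
}
filled = prefill(ratings, threshold=0.5)  # 0.5 is an illustrative value only
```

In a Spark pipeline, the denser matrix produced this way would then be flattened into (user, item, rating) rows and passed to an ALS model, for example `pyspark.ml.recommendation.ALS`. The threshold would be chosen experimentally, as the abstract states.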