Research And Design Of Recommendation System Under Distributed Computing Mode

Posted on: 2022-02-02
Degree: Master
Type: Thesis
Country: China
Candidate: W Chen
GTID: 2518306341987169
Subject: Computer technology
Abstract/Summary:
With the rapid development of modern information technology, people can obtain information quickly. As the volume of information grows, however, it becomes increasingly difficult to find the information one actually wants in massive data sets. Recommendation systems provide strong technical support for filtering information. A single recommendation algorithm suffers from data sparsity and cold start when the data volume is very large and the correlation between users and items is low. To address these problems, this thesis studies a hybrid recommendation system under a distributed computing mode. The main research contents are as follows:

(1) To improve offline recommendation, an offline recommendation model is designed that combines a statistics-based recommendation algorithm with a collaborative filtering algorithm based on a latent factor model. Using mathematical and statistical methods, a computational model is built and trained on data that is unaffected, or only rarely affected, by time, and the resulting offline recommendation lists are used to deliver offline recommendations.

(2) To provide real-time personalized recommendation, a real-time recommendation model is designed. Real-time user data is collected through data access, log collection, data preprocessing, and distributed stream computing, written back to the offline database, and used to compute the recommendation results.

(3) An improved TF-IDF algorithm is used to adjust tag weights. In combination with the real-time recommendation model, similarity is computed against the data gathered during real-time user information collection, and an appropriate number of items is selected to form the real-time recommendation list. This implements a content-based recommendation algorithm and makes the real-time results better match user needs.

(4) A distributed experimental platform is built on Spark and Hadoop. Distributed cluster technology is combined with the hybrid recommendation model in a comparative experiment on the running efficiency of the hybrid algorithm, and data sets of the same size are used to compare the running speed of each platform. Finally, the influence of data size on root mean square error (RMSE) is analyzed on Spark to verify the performance advantage of the distributed computing mode and the recommendation accuracy on massive data sets.

(5) A complete distributed hybrid recommendation system is designed and implemented following the hybrid recommendation model, and a Spark-based movie recommendation system is built on an open-source movie data set. Offline and real-time recommendation are designed and implemented from an engineering perspective. Distributed tools handle communication among the recommendation algorithms on the Spark platform. A visualization platform is built with an open-source Web front-end framework, and the main functions of the recommendation system are implemented.

The research shows that the hybrid recommendation model designed in this thesis can effectively alleviate data sparsity and cold start, improve data processing speed, and offer a new approach to big data processing.
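For readers unfamiliar with the accuracy metric mentioned in item (4), the following is a minimal sketch of how RMSE is commonly computed for rating predictions: the square root of the mean squared difference between predicted and actual ratings. It is an illustration only and not the thesis's code; the function name `rmse` and the sample ratings are assumptions, and the thesis itself evaluates RMSE on Spark rather than in plain Python.

```python
import math


def rmse(predicted, actual):
    """Root mean square error between two equal-length rating sequences.

    Hypothetical helper for illustration; not taken from the thesis.
    """
    if len(predicted) != len(actual) or len(predicted) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Sum of squared differences between each predicted and actual rating.
    squared_error = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    # Average over all ratings, then take the square root.
    return math.sqrt(squared_error / len(predicted))


if __name__ == "__main__":
    # Made-up ratings on a 1-5 scale, used only to exercise the function.
    predicted_ratings = [4.0, 3.5, 5.0, 2.0]
    actual_ratings = [4.5, 3.0, 4.0, 2.0]
    print(rmse(predicted_ratings, actual_ratings))
```

A lower value means the predicted ratings lie closer to the observed ones, which is why the metric can be tracked as the data size changes.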
Keywords/Search Tags:Hybrid recommendation, Spark, Recommendation algorithm