Research And Implementation Of Personalized Recommendation System Based On Spark

Posted on:2018-10-12

Degree:Master

Type:Thesis

Country:China

Candidate:Y Z Zhang

GTID:2348330512488123

Subject:Engineering

Abstract/Summary:

In the era of big data, the rapid growth of the Internet generates enormous volumes of data every day, and extracting valuable information from that data is of great significance. Information is abundant, but users face information overload: they must spend a great deal of time filtering out the information they care about, and finding it quickly and accurately has become increasingly difficult. Personalized recommendation systems emerged to solve this problem. Because a recommendation system must filter valuable data out of massive data sets, it has to handle data at very large scale, and responding to users' needs quickly and in real time requires strong data analysis and processing capabilities. The mainstream big data processing frameworks today include Hadoop and Spark. Spark, a newer-generation parallel computing framework, has become a research hotspot in big data processing. Building a recommendation system on Spark, and drawing on its data processing capabilities and in-memory computing, can greatly improve the system's performance.

This thesis studies the design and implementation of a personalized recommendation system based on the Spark framework and addresses some shortcomings of the underlying algorithms. The main work is as follows:

1) The thesis analyzes the practical characteristics and shortcomings of several mainstream recommendation algorithms: collaborative filtering, content-based recommendation, and recommendation based on singular value decomposition (SVD).
2) The thesis designs and implements an efficient data warehouse using the Parquet storage file format, so that the recommendation system can read and write data quickly during computation. The data warehouse is the foundation on which the recommendation system is built and greatly improves its computational efficiency.
3) Using Spark's programming model and parallel computing capabilities, the thesis designs and implements four groups of recommendation algorithms and the corresponding four recommendation engines. Adding item attribute features to the item-based collaborative filtering algorithm significantly improves its performance.
4) The four recommendation engines are combined into a hybrid recommendation model. By dynamically adjusting parameters and selecting different combinations of engines for different scenarios, the model delivers more personalized recommendations.
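
The ideas in points 3) and 4) can be illustrated with a minimal sketch. It is not the thesis's implementation: it uses plain Python rather than Spark so that it stays self-contained, and the blending weight `alpha`, the engine names, and all function names are hypothetical. It assumes that "adding item attribute features" means mixing a rating-based item similarity with an attribute-based one, and that the hybrid model is a weighted sum of per-engine scores whose weights can change per scenario.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def blended_item_similarity(ratings_a, ratings_b, attrs_a, attrs_b, alpha=0.7):
    """Mix rating-based and attribute-based item similarity.

    ratings_*: {user_id: rating} for each item.
    attrs_*:   {attribute_name: value} for each item.
    alpha:     hypothetical weight given to the rating-based part.
    """
    rating_sim = cosine(ratings_a, ratings_b)
    attr_sim = cosine(attrs_a, attrs_b)
    return alpha * rating_sim + (1 - alpha) * attr_sim


def hybrid_scores(engine_scores, weights):
    """Weighted sum of per-engine item scores.

    engine_scores: {engine_name: {item_id: score}}
    weights:       {engine_name: weight}; engines without a weight count as 0.
    """
    combined = {}
    for name, scores in engine_scores.items():
        w = weights.get(name, 0.0)
        for item, score in scores.items():
            combined[item] = combined.get(item, 0.0) + w * score
    return combined


def top_n(scores, n):
    """Return the n highest-scoring item ids, ties broken by item id."""
    return sorted(scores, key=lambda item: (-scores[item], item))[:n]


if __name__ == "__main__":
    # Hypothetical outputs of two engines for one user.
    engine_scores = {
        "item_cf": {"a": 1.0, "b": 0.5},
        "content": {"b": 1.0, "c": 0.8},
    }
    # Changing the weights per scenario changes the final ranking.
    print(top_n(hybrid_scores(engine_scores, {"item_cf": 0.5, "content": 0.5}), 2))
    print(top_n(hybrid_scores(engine_scores, {"item_cf": 0.0, "content": 1.0}), 2))
```

In a Spark setting the same weighted combination would typically be expressed as joins and aggregations over distributed score tables. The dictionary version above only shows the arithmetic.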

Keywords/Search Tags:

Spark, Recommendation System, Big Data, Personalization, Parallelization

Related items

1	Research And Implementation Of Video Recommendation System Based On Spark
2	An Item-based Collaborative Filtering Recommendation Algorithm Optimization And Parallel Implementation On Spark Platform
3	Research And Implementation Of Hybrid Movie Recommendation System Based On Spark Technology
4	Recommendation Algorithm Based On Trust Network And Its Parallelization In Spark Platform
5	Research On Improvement Of Recommendation Algorithm Based On Spark
6	Research And Implementation Of Classification Algorithm Parallelization Based On Spark
7	Research On Intelligent Recommendation System Based On Flow Data Cube
8	Optimization And Implementation Of The News Individuation Recommendation System Based On Python
9	Research And Optimization Of Recommendation Algorithm Based On Spark Platform
10	Research And Application Of Parallelization Optimization Of Spatial Clustering Algorithm Based On Spark