Parallelization Research On Collaborative Filtering Algorithm Based On Cloud Computing

Posted on:2014-02-01

Degree:Master

Type:Thesis

Country:China

Candidate:B Y Li

Full Text:PDF

GTID:2248330398976769

Subject:Computer application technology

Abstract/Summary:

With the development of information technology, the volume of data on the network is growing explosively. This information overload forces users to spend more time and energy finding useful information. Recommendation systems were developed in this context to help users find information that interests them. Collaborative filtering is currently a popular recommendation approach: it uses the similarity of interests between users to make recommendations based on users’ preference information. However, as the data grows, the computational efficiency of collaborative filtering declines. This thesis therefore uses parallel computing to study the computational efficiency of collaborative filtering on large data sets.

Cloud computing, seen as a development of parallel computing technology, can effectively address the efficiency of complex computations. Hadoop is currently a popular cloud computing platform, and this thesis uses it as the implementation platform. On Hadoop, the key to parallelizing collaborative filtering is resolving the data correlation in the calculation process. Taking the Restricted Boltzmann Machine model and the k Nearest Neighbors model as examples, and based on a detailed analysis of their calculation processes, an algorithm for the Hadoop platform is proposed. Following the characteristics of the MapReduce framework, the algorithm splits the calculation process into a number of tasks. Within each task, a data redundancy mechanism assigns replicated data to each computing node, which resolves the data correlation in the calculation process. In addition, the tasks are not independent: each task has dependency relationships with the tasks that precede and follow it. To resolve these dependencies, when collaborative filtering is split into multiple MapReduce tasks, the algorithm uses dependency-based modular MapReduce to carry out the parallel computation.

Finally, experiments are used to verify the algorithm. A comparative analysis of the Hadoop implementation and the previous implementation concludes that the Hadoop platform improves the computational efficiency of nearest neighbor recommendation and Restricted Boltzmann Machines on large data sets.
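
The idea of splitting a neighborhood-based computation into chained map and reduce tasks can be illustrated with a small in-memory sketch. This is not the thesis's algorithm and does not use Hadoop: the helper `run_job`, the job functions, and the choice of cosine similarity over each user's full rating vector are all assumptions made for illustration. Job 1 groups ratings by item so that every rating for an item is available together when user pairs are formed, and job 2 depends on job 1's output.

```python
from collections import defaultdict
from itertools import combinations
from math import sqrt


def run_job(records, mapper, reducer):
    """Simulate one MapReduce job in memory: map, shuffle by key, reduce."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    output = []
    for key in sorted(groups):
        output.extend(reducer(key, groups[key]))
    return output


# Job 1: group ratings by item, then emit one partial product per user pair.
def map_by_item(record):
    user, item, rating = record
    yield item, (user, rating)


def reduce_item_pairs(item, user_ratings):
    # Assumes each user rates a given item at most once.
    for (u, ru), (v, rv) in combinations(sorted(user_ratings), 2):
        yield (u, v), ru * rv


# Job 2: sum the partial products per user pair (consumes job 1's output).
def map_identity(record):
    yield record  # record is already a (key, value) pair


def reduce_sum(pair, products):
    yield pair, sum(products)


# Independent job: Euclidean norm of each user's rating vector.
def map_user_sq(record):
    user, _item, rating = record
    yield user, rating * rating


def reduce_norm(user, squares):
    yield user, sqrt(sum(squares))


def user_similarities(ratings):
    """Cosine similarity for every user pair sharing at least one item."""
    partial = run_job(ratings, map_by_item, reduce_item_pairs)
    dots = dict(run_job(partial, map_identity, reduce_sum))
    norms = dict(run_job(ratings, map_user_sq, reduce_norm))
    return {
        (u, v): dot / (norms[u] * norms[v])
        for (u, v), dot in dots.items()
    }


if __name__ == "__main__":
    # Hypothetical toy ratings: (user, item, rating).
    ratings = [
        ("a", "i1", 1.0), ("a", "i2", 2.0),
        ("b", "i1", 2.0), ("b", "i2", 4.0),
        ("c", "i1", 3.0),
    ]
    for pair, sim in sorted(user_similarities(ratings).items()):
        print(pair, round(sim, 4))
```

On a real cluster each `run_job` call would correspond to a separate MapReduce job, and the driver would have to ensure that job 2 starts only after job 1 finishes. The k nearest neighbors of a user would then be the k pairs with the highest similarity that involve that user.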

Keywords/Search Tags:

Collaborative filtering, K Nearest Neighbors, Restricted Boltzmann Machines, Parallel processing, Cloud computing, Hadoop

Related items

1	Research On Curriculum Recommendation Algorithm Based On Restricted Boltzmann Machine Collaborative Filtering And Hadoop-Mahout
2	A Study On Collaborative Filtering Model Based On Depth Learning
3	Implementation Of Distributed Recommendation Algorithm Based On Improved Restricted Boltzmann Machine
4	Restricted Boltzmann Machines: A Collaborative Filtering Perspective
5	Research On Collaborative Filtering Recommendation Algorithms Based On Restricted Boltzmann Machine
6	Research And Application Of Collaborative Filtering Algorithm Based On Restricted Boltzmann Machine
7	Research Of Deep Learning Method Based On Restricted Boltzmann Machines
8	A Method Of Improving Restricted Boltzmann Machines Via Theta Pure Dependency
9	Research On Learning Algorithms For Restricted Boltzmann Machines
10	Research On Collaborative Filtering Recommendation Algorithms Based On Positive Correlation And Negative Correlation Nearest Neighbors