The Research On Distributed Collaborative Filtering Algorithm

Posted on: 2016-11-23
Degree: Master
Type: Thesis
Country: China
Candidate: C Q Wang
GTID: 2308330476954990
Subject: Computer Science and Technology
Abstract/Summary:
With the rapid development of Web 2.0, the amount of data on the Internet is growing explosively. It is becoming increasingly difficult for people to pick out the information they need from large volumes of irrelevant content, and personalized recommendation engines have emerged in response. Collaborative filtering (CF) is one of the classic recommendation algorithms. Although CF has seen considerable theoretical progress and practical deployment, it still faces problems of accuracy, performance, cold start, and data sparsity. As data volumes grow, performance becomes particularly problematic. Implementing CF on cloud computing platforms or GPU computing platforms is therefore a clear trend: the storage and processing capacity of cloud platforms and the highly parallel computing capability of GPUs are becoming necessary for the further development of CF.

First, this thesis introduces the principles and classification of CF algorithms in detail and describes the main problems they face. It then introduces the standard metrics used to evaluate CF algorithms, and finally presents the cloud computing and GPU computing platforms relevant to CF recommendation.

Second, it explains in detail the principle of parallel computation on the GPU computing platform, using vector addition as an example to demonstrate the speed of parallel GPU computation. On this basis, the CF algorithm is implemented on the GPU platform and its performance is analyzed. The experimental results show that the GPU implementation preserves accuracy and recall while running about 100 times faster than the traditional implementation.

Third, this thesis presents a CF algorithm implemented on the cloud computing platform. To address the accuracy and cold-start problems, it proposes a CF algorithm based on two-phase similarity and describes its principle and implementation steps in detail. The accuracy and recall of this algorithm, and its ability to handle the cold-start problem, are then analyzed through experiments. The experimental results show that the CF algorithm based on two-phase similarity improves accuracy and recall.

Finally, a recommendation system is designed and implemented on the cloud computing platform, the GPU computing platform, and the CPU platform. The system supports managing resources, clusters, and algorithms. Users can create a job and run it with different algorithms on different platforms, download the recommendation results, and analyze their accuracy and recall, which the system displays as a bar chart.
Keywords: Machine Learning, Collaborative Filtering, Cloud Computing, GPU Computing
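
For readers unfamiliar with collaborative filtering, the following is a minimal sketch of a generic user-based CF recommender together with the kind of precision and recall evaluation the abstract refers to. It is not the thesis's implementation: the abstract does not describe the two-phase similarity method, so plain cosine similarity is used in its place. The function names, the toy ratings, and the reading of "accuracy" as precision are all assumptions made for illustration.

```python
from math import sqrt


def cosine_similarity(a, b):
    """Cosine similarity between two users' rating dicts (item -> rating)."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[i] * b[i] for i in common)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(ratings, user, k=2, n=2):
    """Return up to n unseen items for `user`, scored by the k most similar users."""
    neighbours = sorted(
        ((cosine_similarity(ratings[user], r), other)
         for other, r in ratings.items() if other != user),
        reverse=True,
    )[:k]
    scores = {}
    for sim, other in neighbours:
        for item, rating in ratings[other].items():
            if item not in ratings[user]:
                # Weight each neighbour's rating by its similarity to the user.
                scores[item] = scores.get(item, 0.0) + sim * rating
    # Highest score first; ties broken alphabetically for determinism.
    ranked = sorted(scores, key=lambda item: (-scores[item], item))
    return ranked[:n]


def precision_recall(recommended, relevant):
    """Precision and recall of a recommendation list against held-out relevant items."""
    hits = len(set(recommended) & set(relevant))
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical toy data: user -> {item: rating}.
ratings = {
    "alice": {"A": 5, "B": 3},
    "bob": {"A": 4, "B": 3, "C": 5},
    "carol": {"A": 1, "D": 4},
}

recs = recommend(ratings, "alice")
print(recs, precision_recall(recs, relevant=["C"]))
```

The pairwise similarity computation is independent for each pair of users, which is the part that lends itself to parallel execution on a GPU or a cluster.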