Research And Application Of Clustering Collaborative Filtering Recommendation Algorithm Based On Hadoop

Posted on: 2017-03-01
Degree: Master
Type: Thesis
Country: China
Candidate: J H Xu
GTID: 2348330488476189
Subject: Control engineering
Abstract/Summary:
With the rapid development of the Internet and the growing volume of network data, we have entered the era of big data. Large amounts of information and merchandise are presented to users at the same time, which creates a serious problem: information overload. Personalized recommendation is an effective solution to this problem, and collaborative filtering, which makes recommendations to a user based on the preferences of groups of users, is the most widely used method in practical recommendation systems. A traditional collaborative filtering algorithm used on its own can no longer meet the demands of processing massive amounts of information, in terms of either efficiency or computational complexity. The development of cloud computing technology offers a new research direction for recommendation algorithms, so big data technology can be combined with them to address problems such as algorithm scalability. This thesis studies and implements a clustering collaborative filtering algorithm based on Hadoop big data processing technology and analyzes its application to a movie data set.
The thesis first studies the concepts related to the two major frameworks of Hadoop, classical clustering algorithms, and recommendation algorithms. It then proposes a distributed clustering collaborative filtering recommendation algorithm, built on Hadoop big data processing technology, to address the data sparsity and scalability problems of collaborative filtering. To handle data sparsity, matrix factorization is used to preprocess the initial data, and a clustering algorithm builds a cluster model from the preprocessed data. The cluster model and the collaborative filtering algorithm are then used together to form the candidate space for recommendation, from which the final recommendations are produced.

The main work of this thesis is summarized as follows:
1) Research and analyze commonly used clustering algorithms to gain a comprehensive understanding of the strengths and weaknesses of the typical algorithms, with a focus on the K-means clustering algorithm.
2) Research and analyze the classical recommendation algorithms in depth, especially the collaborative filtering recommendation algorithm.
3) Use a matrix decomposition algorithm to preprocess the sparse data, then use the improved K-means clustering algorithm to construct the cluster model.
4) Combine the cluster model with the collaborative filtering recommendation algorithm to form a hybrid algorithm.
5) Adapt the K-means clustering algorithm and the collaborative filtering algorithm to the MapReduce programming model so that the data can be processed in a distributed manner, which addresses the scalability problem of the algorithm.
6) Evaluate the hybrid recommendation algorithm.

The thesis uses the MovieLens data set to verify the algorithms experimentally and analyzes the results. The experiments show that the clustering collaborative filtering algorithm built on Hadoop effectively improves recommendation quality, greatly improves recommendation efficiency, and scales well in a cloud environment.
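
As an illustration of item 5, the following is a minimal single-process sketch of how one K-means iteration can be phrased as a map step and a reduce step. It is not the thesis's Hadoop implementation: the function names (`nearest`, `map_phase`, `reduce_phase`, `kmeans_iteration`), the toy points, and the hand-picked initial centroids are all assumptions made for this example, and the thesis's improved K-means and its matrix factorization preprocessing are not reproduced here.

```python
from collections import defaultdict


def nearest(point, centroids):
    """Return the index of the centroid closest to point (squared Euclidean)."""
    return min(
        range(len(centroids)),
        key=lambda k: sum((p - c) ** 2 for p, c in zip(point, centroids[k])),
    )


def map_phase(points, centroids):
    """Map: emit (cluster_id, (vector, 1)) for every input vector."""
    for point in points:
        yield nearest(point, centroids), (point, 1)


def reduce_phase(pairs, centroids):
    """Reduce: sum the vectors of each cluster and divide by their count."""
    sums, counts = {}, defaultdict(int)
    for k, (point, n) in pairs:
        if k not in sums:
            sums[k] = list(point)
        else:
            sums[k] = [s + p for s, p in zip(sums[k], point)]
        counts[k] += n
    # A cluster that received no points keeps its previous centroid.
    return [
        [s / counts[k] for s in sums[k]] if k in sums else list(centroids[k])
        for k in range(len(centroids))
    ]


def kmeans_iteration(points, centroids):
    """One full iteration: assign points (map), then recompute centroids (reduce)."""
    return reduce_phase(map_phase(points, centroids), centroids)


# Toy data (hypothetical): two obvious groups in two dimensions.
points = [(1.0, 1.0), (1.0, 2.0), (8.0, 8.0), (9.0, 8.0)]
centroids = [[0.0, 0.0], [10.0, 10.0]]
new_centroids = kmeans_iteration(points, centroids)
```

In a real MapReduce job the map step would run in parallel over partitions of the rating vectors and the framework would group the emitted pairs by cluster id before the reduce step; the sketch runs both steps in one process only to show the data flow. Repeating `kmeans_iteration` until the centroids stop changing gives the usual K-means loop.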
Keywords/Search Tags:Hadoop, K-means, collaborative filtering, MapReduce, matrix decomposition, maximum and minimum