Since the arrival of the Internet, humanity has entered an era of information explosion. For example, Twitter receives more than 100,000 new items of data every minute, and page views on the social network Facebook exceed six million. How to store such huge volumes of data, and how to compute over them, has become a problem we must face. Cloud computing offers a new, distributed computing model that has attracted widespread attention in both industry and academia, and the storage and processing of massive data has become an active research topic.

We are now in the era of big data. A recommendation system faces huge amounts of user data and item data, along with the web logs, user behavior records, and new content produced every day. Storing and processing this data is a thorny problem. Reading and writing large volumes of data with a traditional RDBMS becomes a bottleneck, and traditional databases are poorly suited to distributed storage: petabytes of data must be spread evenly across different nodes, and a parallel computing framework is needed to process them. Hadoop is a well-suited choice. Under development since 2004, it is now widely applied in big data processing. Hadoop provides the Hadoop Distributed File System (HDFS) together with the MapReduce parallel programming model, so distributed computation can be carried out easily on top of the file system. This paper therefore adopts Hadoop as its distributed platform.

This thesis studies the advantages and disadvantages of the mainstream recommendation algorithms and, on the engineering side, designs a combination of recommendation engines. Based on an in-depth analysis of HDFS, the distributed file system of the Hadoop platform, and of the MapReduce computing model, it presents a hybrid recommendation system for cloud computing built on Hadoop, together with related tests.