
Design And Implementation Of Big Data Platform Based On Hadoop And Its Application In Recommendation System

Posted on: 2017-01-22
Degree: Master
Type: Thesis
Country: China
Candidate: M Liu
Full Text: PDF
GTID: 2348330518495683
Subject: Electronics and Communications Engineering
Abstract/Summary:
In recent years, with the development of mobile communication, especially the rapid expansion of 3G/4G mobile networks and the rapid development of cloud computing technology, Internet technologies are no longer applied only on PC terminals but have spread rapidly to smart devices and a variety of cloud services. This has brought an explosion of data. In the era of big data, we have to store and analyze these huge amounts of data, yet data at this scale cannot be stored and analyzed on any single piece of hardware. To address this problem, this thesis aims to build a big data platform on which big data can be collected, stored, analyzed, and computed. We propose a cluster platform construction model based on the Hadoop ecosystem and its components, which can be used for the storage and analysis of large-scale data. The model is composed of four parts: a data acquisition and preprocessing system, a data storage system, a data computing and analysis system, and a resource statistical analysis system. The data acquisition and preprocessing system is built on a Kafka cluster, a distributed messaging system, which allows data to be sent both to the Hadoop-based system and to the Spark-based real-time processing system. The data storage system is based on the Hadoop Distributed File System (HDFS) and the HBase distributed database. The data computing and analysis system is based on the MapReduce parallel programming model. The resource statistical analysis system collects statistics on the computing and storage resources of the Hadoop data platform cluster; it can monitor the running state of the cluster and, through its analysis module, report the operating status of the platform cluster over recent days. Finally, based on the Hadoop data platform and Mahout, a recommendation application is studied and tested.
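As an illustration only, the MapReduce programming model named in the abstract, applied to a recommendation task, might be sketched as below. This is a single-process Python sketch and not the thesis's implementation: the abstract does not say which recommendation algorithm was used, so item co-occurrence counting is assumed here as one common basis for item-based recommenders. All function names (`map_phase`, `shuffle`, `reduce_phase`, `recommend`) and the sample data in `HISTORY` are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def map_phase(records, mapper):
    """Apply the mapper to every input record and collect (key, value) pairs."""
    pairs = []
    for record in records:
        pairs.extend(mapper(record))
    return pairs


def shuffle(pairs):
    """Group values by key, as a MapReduce framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups, reducer):
    """Apply the reducer to each key and its grouped values."""
    return {key: reducer(key, values) for key, values in groups.items()}


def cooccurrence_mapper(record):
    """Emit ((item_a, item_b), 1) once for each unordered item pair of one user."""
    _user, items = record
    return [(pair, 1) for pair in combinations(sorted(set(items)), 2)]


def sum_reducer(_key, values):
    """Sum the counts emitted for one item pair."""
    return sum(values)


def recommend(user_items, cooccurrence, top_n=3):
    """Rank items the user lacks by how often they co-occur with items the user has."""
    owned = set(user_items)
    scores = defaultdict(int)
    for (a, b), count in cooccurrence.items():
        if a in owned and b not in owned:
            scores[b] += count
        elif b in owned and a not in owned:
            scores[a] += count
    # Highest score first; ties broken alphabetically for a stable result.
    return sorted(scores, key=lambda item: (-scores[item], item))[:top_n]


# Hypothetical interaction history: (user, items the user interacted with).
HISTORY = [
    ("u1", ["book", "pen", "lamp"]),
    ("u2", ["book", "pen"]),
    ("u3", ["pen", "lamp"]),
    ("u4", ["book", "desk"]),
]

cooccurrence = reduce_phase(
    shuffle(map_phase(HISTORY, cooccurrence_mapper)), sum_reducer
)
suggestions = recommend(["book"], cooccurrence)
```

On a real cluster the map and reduce functions would run in parallel over data stored in HDFS, and the grouping step would be handled by the framework; the sketch only shows how the work is divided between the two phases.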
Keywords/Search Tags: data big bang, big data platform, Hadoop, resources statistical analysis system