Research And Application Of Data Processing Based On Hadoop

Posted on: 2017-03-13
Degree: Master
Type: Thesis
Country: China
Candidate: S L Luo
Full Text: PDF
GTID: 2308330488450507
Subject: Physical Electronics
Abstract/Summary:
With the development of computer technology, and especially the rapid development of the Internet, data in every field, from data that is requested to data that is generated, has grown explosively. Traditional approaches to data storage, management, and application increasingly struggle to meet practical demands for data processing and analysis. Cloud computing, an emerging technology, has become the main means of processing massive amounts of data. As a typical representative of cloud computing technology, Hadoop uses the Hadoop Distributed File System (HDFS) and MapReduce to store and compute over massive data. Hadoop has also been widely adopted and has developed rapidly, because the platform is open source and its clusters are inexpensive to build.

In recent years, the rapid development of domestic tourism has created demand for tourism recommendation services. Individuals can collect a great deal of information from the Internet, but its quality cannot be guaranteed, collecting and analyzing it is time-consuming, and the resulting service to users is limited and slow. Good recommendations of tourist routes and accommodations require collecting huge amounts of real-time data from the network, as well as big data computation based on a recommendation model. A system that stores and processes data effectively and reliably is therefore needed to improve this situation, and Hadoop is an excellent choice.

This thesis provides a systematic analysis and comparison of current data processing techniques. We studied in depth the core technologies of Hadoop, HDFS and MapReduce, then built a Hadoop platform and optimized the Hadoop cluster. Tests on simulated tourist data suggest that the optimized Hadoop cluster delivers 3.5 times the processing performance of the unoptimized one. Based on the test results, we propose a design method for building a Hadoop-based tourist service recommendation model.
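
The MapReduce model named in the abstract can be illustrated with a minimal single-process sketch. It is not the thesis's implementation and does not run on a Hadoop cluster; the record format ("user,destination") and every function name below are hypothetical, chosen only to show the map, shuffle, and reduce steps on simulated tourist records.

```python
from collections import defaultdict


def map_phase(records):
    """Map step: emit a (destination, 1) pair for each visit record.

    Assumes each record is a "user,destination" string (hypothetical format).
    """
    for line in records:
        _user, destination = line.split(",", 1)
        yield destination.strip(), 1


def shuffle(pairs):
    """Shuffle step: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce step: sum the grouped values to get a count per destination."""
    return {key: sum(values) for key, values in groups.items()}


def count_visits(records):
    """Run the three steps in sequence, in one process."""
    return reduce_phase(shuffle(map_phase(records)))


# Simulated records, invented for illustration only.
SAMPLE = ["u1,Beijing", "u2,Xi'an", "u3,Beijing"]

if __name__ == "__main__":
    print(count_visits(SAMPLE))
```

On a real cluster the map and reduce steps would run in parallel across nodes, with input read from HDFS; the sketch only shows how the data flows between the steps.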
Keywords/Search Tags: Hadoop, MapReduce, HDFS, data processing