Research On The Performance Of Hadoop Medium Cluster Based On Docker

Posted on: 2019-12-23
Degree: Master
Type: Thesis
Country: China
Candidate: J Ling
GTID: 2428330566499342
Subject: Software engineering
Abstract/Summary:
With the continuous development of the online world, the scale of data has grown tremendously. Faced with such massive data, the Hadoop platform has become the mainstream platform for big data processing, thanks to its strong data processing capabilities and low hardware requirements. As the most widely used big data processing framework, Hadoop can be optimized to handle different big data applications. Hadoop is likely to keep its leading position in the era of big data, and optimizing Hadoop will remain a long-term issue. In this thesis, Docker container technology is used to fully integrate the existing hardware resources and, on the basis of a detailed analysis of the Hadoop cluster, a performance optimization of medium-sized Hadoop clusters based on Docker containers is implemented. After a detailed analysis of the Map/Reduce execution process, optimizations of its setup/cleanup tasks, job/task notification mechanism, and data allocation are proposed. For the setup/cleanup tasks, the original heartbeat mechanism is replaced with a direct notification mechanism. Important job/task event notifications are moved from the original heartbeat mechanism to RPC, while other event notifications still use the heartbeat mechanism. Data distribution takes into account the differing computing power of each node: a performance ratio is calculated and used to allocate data storage locations more reasonably. Based on this work, the task execution time under the optimization scheme is compared with that of the unoptimized system. Finally, the influence of the optimization scheme on system load is evaluated in terms of network traffic, CPU usage, and memory usage. The test results show that the optimization scheme improves the performance of small and medium Hadoop clusters and has an acceptable impact on system load.
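The abstract does not give the formula behind the performance ratio, so the following is only a minimal sketch of what ratio-based data allocation could look like. It assumes each node has a single numeric performance score and splits a number of data blocks in proportion to those scores, using largest-remainder rounding. The function name `allocate_blocks`, the node names, and the scores are all hypothetical and are not taken from the thesis or from any Hadoop API.

```python
def allocate_blocks(total_blocks, node_scores):
    """Split total_blocks across nodes in proportion to their scores.

    node_scores maps a node name to a positive performance score
    (hypothetical measure; the thesis's own metric is not given).
    """
    total_score = sum(node_scores.values())
    if total_score <= 0:
        raise ValueError("performance scores must sum to a positive value")

    # Ideal (fractional) share of blocks for each node.
    exact = {n: total_blocks * s / total_score for n, s in node_scores.items()}

    # Start from the integer part of each share.
    alloc = {n: int(q) for n, q in exact.items()}

    # Hand out the blocks lost to rounding, largest remainder first.
    leftover = total_blocks - sum(alloc.values())
    by_remainder = sorted(node_scores, key=lambda n: exact[n] - alloc[n], reverse=True)
    for n in by_remainder[:leftover]:
        alloc[n] += 1
    return alloc


if __name__ == "__main__":
    # Hypothetical scores: node "a" is assumed twice as fast as "b" and "c".
    print(allocate_blocks(100, {"a": 2.0, "b": 1.0, "c": 1.0}))
```

In this sketch a faster node simply receives proportionally more blocks, so tasks scheduled near their data would tend to finish at similar times; an actual implementation would also have to respect HDFS replication and rack placement, which are not modeled here.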
Keywords/Search Tags: Hadoop, Map/Reduce, Docker, Cluster