Design and Implementation of a Comprehensive Experimental Platform for Distributed Computing

Posted on: 2022-07-27 | Degree: Master | Type: Thesis
Country: China | Candidate: Y Y Cui | Full Text: PDF
GTID: 2518306605466244 | Subject: Master of Engineering
Abstract/Summary:
With the rapid development of mobile Internet technology, the amount of data generated by Internet companies has grown exponentially. Some applications now require very large computing power, and traditional centralized computing can no longer meet the storage and processing needs of such large volumes of data. Distributed computing divides a large task or data set into many small subtasks, allocates them to multiple machines for processing, and combines the partial results into the final result, which greatly improves computing efficiency.

Against this background, universities are steadily advancing their informatization and hope to strengthen the training of talent in big data, distributed computing, and related fields. The teaching experiment platform, an important application in university teaching, should keep pace. However, university laboratories typically have many users and are difficult to manage. To carry out a distributed computing experiment, each user usually has to run multiple virtual machines on a single computer to build and configure a cluster environment. This setup is complicated and time-consuming, distributed programs run inefficiently on it, and the results are unsatisfactory. In addition, teachers often spend a great deal of time manually checking source code when assessing distributed computing experiments, which is extremely inefficient. Although many related experimental platforms exist, most of them focus on using and managing experimental resources efficiently; their support for handling distributed computing experiments efficiently and for assisting experimental assessment is incomplete.

To address these problems, this thesis analyzes the current state of laboratory construction in colleges and universities and combines cloud computing with college education to design and implement a comprehensive experimental platform for distributed computing based on Hadoop. On this platform, users do not need to build an experimental environment. They submit program code to the platform through a browser, and the platform completes the computation automatically. The platform also records and automatically judges each user's experimental process and results, and automatically compiles statistics on each user's completion of the experiments.

The main work of this thesis is as follows:
(1) Design and implementation of the distributed cluster deployment plan. Based on a study of the core architecture, basic principles, and workflow of Hadoop components such as HDFS, MapReduce, and YARN, this thesis deploys a highly reliable and highly scalable distributed cluster environment on three servers.
(2) Design and implementation of the remote invocation plan. The thesis surveys commonly used network transmission protocols, compares their advantages and disadvantages, and proposes its own scheme: the more secure SSH protocol is used for encrypted connections and remote task invocation, and tasks are remotely submitted to the distributed cluster and executed through the Web server.
(3) Design and implementation of the task scheduling and monitoring plan. For the execution of tasks on the platform, a thread-pool-based task scheduling scheme is designed to coordinate the tasks submitted by multiple users with load balancing across the distributed cluster.

Finally, the thesis builds an experimental environment from three virtual machines and tests the platform. The experimental results show that the platform can effectively complete distributed computing tasks, that multiple users can run computations and verify experimental results by sharing the platform's software and hardware resources, and that the platform provides functions such as auxiliary experimental assessment. This addresses problems in the experimental process such as difficult deployment, low efficiency, and low resource utilization, and greatly reduces the pressure on teachers.
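
The abstract does not give the details of the thread-pool-based scheduling scheme, so the following is only a minimal sketch of the general idea, not the thesis's implementation. It assumes a fixed list of cluster nodes, a least-loaded selection rule for load balancing, and jobs modeled as local Python callables; the names ClusterScheduler, word_count_job, and run_demo are hypothetical. On a real platform the job body would instead submit the user's program to the Hadoop cluster over SSH.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class ClusterScheduler:
    """Toy scheduler: a thread pool dispatches each job to the least-loaded node.

    Hypothetical illustration only; node names and the selection rule are assumptions.
    """

    def __init__(self, nodes, max_workers=4):
        self._load = {node: 0 for node in nodes}  # running jobs per node
        self._lock = threading.Lock()             # protects self._load
        self._pool = ThreadPoolExecutor(max_workers=max_workers)

    def _acquire_node(self):
        # Pick the node with the fewest running jobs and reserve a slot on it.
        with self._lock:
            node = min(self._load, key=self._load.get)
            self._load[node] += 1
            return node

    def _release_node(self, node):
        with self._lock:
            self._load[node] -= 1

    def _run(self, user, job):
        node = self._acquire_node()
        try:
            # A real platform would submit the job to `node` remotely here.
            return user, node, job(node)
        finally:
            self._release_node(node)

    def submit(self, user, job):
        """Queue a job for a user; returns a Future with (user, node, result)."""
        return self._pool.submit(self._run, user, job)

    def shutdown(self):
        self._pool.shutdown(wait=True)


def word_count_job(text):
    """Build a stand-in job that counts words locally (assumed example workload)."""
    def job(node):
        counts = {}
        for word in text.split():
            counts[word] = counts.get(word, 0) + 1
        return counts
    return job


def run_demo():
    scheduler = ClusterScheduler(["node1", "node2", "node3"], max_workers=3)
    futures = [
        scheduler.submit("alice", word_count_job("a b a")),
        scheduler.submit("bob", word_count_job("x y x x")),
    ]
    results = [f.result() for f in futures]
    scheduler.shutdown()
    return results


if __name__ == "__main__":
    for user, node, counts in run_demo():
        print(user, node, counts)
```

The pool size bounds how many user jobs run at once, while the per-node counter spreads concurrent jobs across nodes. Which node a given job lands on depends on timing, so the sketch makes no claim about the specific assignment.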
Keywords/Search Tags:Distributed computing, Hadoop, HDFS, MapReduce, College education