
Research On Key Technologies Of The MPI-based High Performance Cloud Computing Platform

Posted on: 2014-03-12
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Y C Guo
Full Text: PDF
GTID: 1268330425479889
Subject: Computer application technology
Abstract/Summary:
Cloud computing is a leading technique for massive data processing, but it is inefficient for problems that are both data-intensive and compute-intensive. The lower layer of a cloud platform relies on virtualization, so all system and application software runs on virtual hardware, which reduces performance by up to 20 percent according to one report in the literature. In addition, the MapReduce paradigm used in cloud computing adopts a store-and-forward strategy for intermediate data, which generates a large number of I/O operations on big data and cannot be applied efficiently to high-performance scientific computing.

Based on these considerations, and in view of MPI's weakness in fault tolerance, the dissertation focuses on developing an MPI-based high performance cloud computing platform (HPCCP). The platform builds its lower layer directly on heterogeneous computing nodes without virtualization, and reimplements the MapReduce paradigm by integrating multi-layer fault-tolerant MPI techniques and multithreading, which avoids a large number of unnecessary I/O operations and increases efficiency. The proposed and implemented MPI-based HPCCP prototype can efficiently handle data-intensive as well as compute-intensive problems, and so meets the requirements of high performance cloud computing.

The main contributions of the proposed MPI-based HPCCP are as follows:

1. A methodology that configures the lower layer of the cloud computing platform directly on heterogeneous computing nodes without virtualization. Instead of adopting the popular virtualization techniques, the MPI-based HPCCP proposed and implemented in the dissertation takes full advantage of MPI's ability to explore and adapt to heterogeneous computing nodes in order to construct the IaaS layer of the cloud computing platform directly. This contribution increases the productivity of the cloud platform by reducing the harmful effect of virtualization on hardware capability in the IaaS layer.

2. Improvement and implementation of multi-layer MPI fault-tolerance techniques. Weak fault tolerance is a crucial defect of MPI, in contrast to its excellent high performance computing capability, and it limits the use of MPI in big data processing. MPI cannot be adopted in cloud computing unless this defect is resolved. The dissertation comprehensively studies MPI fault-tolerance techniques, and proposes and implements three of them: job rescheduling, job/task recovery, and dynamic task migration, which are placed in three different layers. This contribution remedies MPI's defect in fault tolerance and is another distinguishing feature of the dissertation.

3. Design and implementation of an efficient MapReduce prototype on the MPI-based HPCCP. In current MapReduce implementations, data transfer is encapsulated by the distributed file system (DFS), so repeated I/O operations against the DFS take place during data processing, which seriously reduces system efficiency. The dissertation reimplements the MapReduce paradigm on a redesigned multi-layer fault-tolerant MPI platform that can process intermediate results directly, reducing unnecessary I/O operations, speeding up cloud computing, and achieving higher efficiency. Compared with Hadoop, the currently popular implementation of MapReduce, the MPI-based HPCCP reduces the processing time of a big-data fingerprint recognition task to 25 percent.

The dissertation carries out extensive tests and case studies on the MPI-based HPCCP, covering the influence of data block size on data processing performance, the robustness and efficiency of the multi-layer fault tolerance, and the general performance of the platform. Finally, Hadoop and the MPI-based HPCCP are compared. The experiments show that the cloud computing platform proposed and implemented in the dissertation runs about four times faster than the traditional Hadoop platform. The last section lists conclusions and some open problems, and briefly describes the research proposed for the near future.
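The idea of processing intermediate results directly, rather than writing them to a distributed file system between the map and reduce phases, can be illustrated with a minimal single-process sketch. This is not the dissertation's implementation: the function names (`map_phase`, `shuffle_in_memory`, `reduce_phase`, `run_job`) and the word-count workload are assumptions chosen for illustration, and the in-memory buckets only stand in for what would be a message exchange between MPI processes on a real cluster.

```python
from collections import defaultdict


def map_phase(blocks, map_fn):
    """Apply map_fn to each input block and keep the (key, value) pairs in memory."""
    return [list(map_fn(block)) for block in blocks]


def shuffle_in_memory(mapped, num_reducers):
    """Route every pair to a reducer bucket by key hash, without any disk I/O.

    Hypothetical stand-in: on a cluster this step would be an exchange of
    messages between processes instead of a local loop.
    """
    buckets = [defaultdict(list) for _ in range(num_reducers)]
    for pairs in mapped:
        for key, value in pairs:
            buckets[hash(key) % num_reducers][key].append(value)
    return buckets


def reduce_phase(buckets, reduce_fn):
    """Reduce each key's value list; every key lives in exactly one bucket."""
    result = {}
    for bucket in buckets:
        for key, values in bucket.items():
            result[key] = reduce_fn(key, values)
    return result


def word_count_map(block):
    """Illustrative map function: emit (word, 1) for every word in the block."""
    for word in block.split():
        yield word.lower(), 1


def word_count_reduce(key, values):
    """Illustrative reduce function: sum the counts for one word."""
    return sum(values)


def run_job(blocks, num_reducers=2):
    """Run map, in-memory shuffle, and reduce over the given input blocks."""
    mapped = map_phase(blocks, word_count_map)
    buckets = shuffle_in_memory(mapped, num_reducers)
    return reduce_phase(buckets, word_count_reduce)


if __name__ == "__main__":
    print(run_job(["MPI cloud MPI", "cloud computing"]))
```

The point of the sketch is only that the output of the map phase is handed to the reduce phase through memory; a store-and-forward design would write `mapped` to shared storage and read it back before reducing.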
Keywords/Search Tags: Cloud Computing, High Performance Computing, Multilayer Fault-tolerant, Distributed Processing, MPI, Massive Data Processing