| The hardware and software of the Raspberry Pi have flourished since its inception. This paper adopts the Raspberry Pi as the building block of a cluster system, studies the new features of its ARMv8 architecture, and accelerates iterative sparse-matrix algorithms such as SGS, LU-SGS, and GMRES on the ARM architecture. The main work of this paper is as follows: (1) Building a self-developed ARM-based Raspberry Pi cluster system. To make full use of the features of the latest Raspberry Pi, the first task of this module is to upgrade the Raspberry Pi software system, including compiling the toolchain and operating system, porting the relevant mathematical libraries, and constructing the parallel environment. The second task is to build the cluster itself: 50 Raspberry Pi boards are organized in a cabinet, single sign-on is implemented, and a tree network topology is built, yielding a high-performance Raspberry Pi cluster. (2) Setting up a monitoring system. To better monitor the usage of each node while the cluster is computing, Ganglia was selected as the monitoring platform and deployed on the cluster. In secondary development, the main interface of the visualization platform was trimmed to make it clearer and more concise, and monitoring items better suited to the Raspberry Pi cluster were modified and added. (3) Evaluating the performance of the Raspberry Pi cluster with HPCG. The cluster built in this paper performs well under the HPCG benchmark: ten Raspberry Pi nodes achieve six to seven times the performance of a single Raspberry Pi, and the whole cluster of fifty nodes achieves 30 to 40 times the performance of a single Raspberry Pi. (4) Accelerating sparse-matrix iterative algorithms. Three iterative algorithms, SGS, LU-SGS, and GMRES, were applied to grid data, and four versions were implemented: a C++ single-process version, a C++ multi-process version, a C multi-process version (GMRES implemented with MPI), and a C multi-process version (GMRES implemented with PETSc). The results show that all three algorithms converge more than 10 times faster after acceleration with MPI. |
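
For readers unfamiliar with SGS (symmetric Gauss-Seidel), the following is a minimal single-process sketch of the iteration on a sparse matrix stored row by row. It is written in Python for brevity and is not the paper's C/C++ implementation. The names `sgs_sweep`, `solve_sgs`, and `poisson_1d`, the dictionary-based row storage, the tolerance, and the small tridiagonal test matrix are all assumptions made for illustration only.

```python
def sgs_sweep(A, b, x):
    """One symmetric Gauss-Seidel sweep, updating x in place.

    A is a list of rows; each row is a dict {column: value} of its nonzeros
    (illustrative storage format, not the one used in the paper).
    """
    n = len(b)
    # Forward pass (rows 0..n-1) followed by backward pass (rows n-1..0).
    for order in (range(n), range(n - 1, -1, -1)):
        for i in order:
            off_diag = sum(v * x[j] for j, v in A[i].items() if j != i)
            x[i] = (b[i] - off_diag) / A[i][i]
    return x


def residual_norm(A, b, x):
    """Infinity norm of the residual b - A x."""
    return max(
        abs(b[i] - sum(v * x[j] for j, v in A[i].items()))
        for i in range(len(b))
    )


def solve_sgs(A, b, tol=1e-10, max_sweeps=500):
    """Repeat SGS sweeps from a zero initial guess until the residual is below tol."""
    x = [0.0] * len(b)
    for sweep in range(1, max_sweeps + 1):
        sgs_sweep(A, b, x)
        if residual_norm(A, b, x) < tol:
            return x, sweep
    return x, max_sweeps


def poisson_1d(n):
    """Hypothetical test matrix: tridiagonal (-1, 2, -1), symmetric positive definite."""
    rows = []
    for i in range(n):
        row = {i: 2.0}
        if i > 0:
            row[i - 1] = -1.0
        if i < n - 1:
            row[i + 1] = -1.0
        rows.append(row)
    return rows


if __name__ == "__main__":
    x, sweeps = solve_sgs(poisson_1d(5), [1.0] * 5)
    print(sweeps, [round(v, 6) for v in x])
```

Each update of `x[i]` uses the values already refreshed earlier in the same pass, which is why a distributed version needs a scheme such as domain decomposition or reordering to expose parallelism; the sketch above leaves that aspect out entirely.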