
Research On Optimizing I/O Performance Of High Performance Computing Systems

Posted on: 2019-12-21  Degree: Doctor  Type: Dissertation
Country: China  Candidate: J Yu  Full Text: PDF
GTID: 1368330623950479  Subject: Computer Science and Technology
Abstract/Summary:
With the growing demands of scientific applications, the computing capability of high-performance computers is entering the exascale level. However, I/O performance has not kept pace with computing power, which results in a serious I/O bottleneck: applications spend more time waiting for data I/O, significantly impacting overall system performance. Researchers are building storage systems on top of compute-node-local fast storage devices (such as NVMe SSDs) to alleviate the I/O bottleneck. However, user jobs have varying I/O bandwidth requirements, so placing these expensive devices on all compute nodes and building them into a global storage system is seriously wasteful. In addition, current node-local storage systems must cope with the challenging small-I/O and rank-0 I/O patterns of HPC workloads. In this paper, we present a Workload-Aware Temporary Cache (WatCache) to meet these challenges. We design a workload-aware node allocation method that allocates fast storage devices to jobs according to their I/O requirements and merges each job's devices into a separate temporary cache space. We implement a metadata caching strategy that reduces the metadata overhead of I/O requests to improve small-I/O performance. We design a data layout strategy that distributes consecutive data exceeding a threshold across multiple devices to achieve higher aggregate bandwidth for rank-0 I/O.

Job scheduling in HPC systems by default allocates adjacent compute nodes to jobs to minimize communication overhead. However, this is no longer appropriate for data-intensive jobs running on systems with an I/O forwarding layer, where each I/O node performs I/O on behalf of a subset of nearby compute nodes. Under the default node allocation strategy, a job's nodes are located close to each other, so the job uses only a limited number of I/O nodes. Since the I/O activities of jobs are bursty, at any moment only a minority of jobs in the system are busy processing I/O. Consequently, the bursty I/O traffic in the system is also concentrated in space, making the load on I/O nodes highly unbalanced. In this paper, we use job logs and I/O traces collected from Tianhe-1A to quantitatively analyze the two causes of spatially bursty I/O: uneven I/O traffic among a job's processes and uneven distribution of a job's nodes. Based on this analysis, we propose a node allocation strategy that takes into account the differing I/O traffic of processes, so that the I/O traffic of data-intensive jobs can be processed by more I/O nodes and more evenly.

The I/O forwarding layer has become a standard storage layer in today's HPC systems in order to scale storage systems to new levels of concurrency. With the deepening of the storage hierarchy, I/O requests must traverse several types of nodes to access the required data, including compute nodes, I/O nodes, and storage nodes. It has become difficult to control the data path and to apply cross-layer I/O optimizations. In this paper, we analyze the problems caused by an uncoordinated I/O stack, including the inability to expose cross-node data locality, load imbalance on I/O nodes, and I/O contention on storage nodes. To this end, we propose a well-coordinated I/O stack, which coordinates the data path between compute nodes and I/O nodes for better load balancing and data locality through a job-level I/O node mapping, and coordinates the data path between I/O nodes and storage nodes to reduce I/O interference.
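The threshold-based data layout described for WatCache might be sketched as follows. This is an illustrative sketch only: the function name `layout`, the 4 MiB threshold, and the 1 MiB stripe unit are all hypothetical values chosen for the example, not the dissertation's actual parameters.

```python
def layout(offset, length, num_devices,
           threshold=4 * 1024 * 1024, stripe=1024 * 1024):
    """Map a contiguous request to cache-device extents.

    Requests at or below the threshold stay on a single device,
    keeping small I/O cheap.  Larger requests (e.g. rank-0 I/O)
    are striped round-robin across all devices so one writer can
    use the aggregate bandwidth.  Returns (device, offset, length)
    tuples.  All parameter values here are illustrative.
    """
    if length <= threshold:
        # Small request: one extent on one device, chosen by offset.
        return [((offset // stripe) % num_devices, offset, length)]
    extents = []
    pos, end = offset, offset + length
    while pos < end:
        # Cut at the next stripe boundary (or the end of the request).
        chunk_end = min((pos // stripe + 1) * stripe, end)
        extents.append(((pos // stripe) % num_devices, pos, chunk_end - pos))
        pos = chunk_end
    return extents
```

A 3 MiB write stays on one device, while an 8 MiB write is split into eight 1 MiB extents spread over all devices.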
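The idea of allocating nodes according to processes' I/O traffic could be approximated with a greedy heuristic like the one below. This is a simplified sketch under assumed conditions (one process per node, a static compute-node-to-I/O-node grouping, and known per-process traffic estimates); the function `allocate` and its parameters are hypothetical, not the strategy actually evaluated on Tianhe-1A.

```python
from collections import defaultdict

def allocate(free_nodes, traffic, nodes_per_ionode=32):
    """Pick len(traffic) compute nodes for a job.

    free_nodes : ids of currently free compute nodes
    traffic    : per-process I/O traffic estimates (one process
                 per node, for simplicity)

    Greedy heuristic: place the heaviest remaining process on a
    free node served by the I/O node with the least accumulated
    traffic, so the job's I/O load spreads over more I/O nodes.
    """
    load = defaultdict(int)
    groups = defaultdict(list)          # I/O node -> its free compute nodes
    for n in free_nodes:
        groups[n // nodes_per_ionode].append(n)
    plan = []
    for t in sorted(traffic, reverse=True):
        # Least-loaded I/O node that still has a free compute node.
        ion = min((i for i in groups if groups[i]), key=lambda i: load[i])
        plan.append((groups[ion].pop(), t))
        load[ion] += t
    return plan, dict(load)
```

With two I/O nodes and traffic estimates `[100, 100, 1, 1]`, the two heavy processes land under different I/O nodes, balancing the forwarded load instead of concentrating it.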
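A job-level I/O node mapping, as opposed to the static vicinity-based mapping, might look like the minimal sketch below: each job gets its own compute-to-I/O-node mapping, rotated by job id so that concurrent jobs start on different I/O nodes. The function `map_job` and the round-robin policy are assumptions for illustration, not the coordination mechanism the paper proposes.

```python
def map_job(job_id, job_nodes, num_ionodes):
    """Build a per-job mapping from compute nodes to I/O nodes.

    Rotating the starting I/O node by job id staggers concurrent
    jobs, and spreading one job's nodes round-robin lets its bursty
    I/O traffic use many I/O nodes instead of a nearby few.
    This round-robin policy is illustrative only.
    """
    start = job_id % num_ionodes
    return {n: (start + k) % num_ionodes
            for k, n in enumerate(sorted(job_nodes))}
```

For example, a four-node job maps one node to each of four I/O nodes, while a second job's mapping begins on a different I/O node.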
Keywords/Search Tags:Parallel I/O, Flash Cache, I/O Forwarding, Spatially Bursty I/O, Node Allocation Strategy, Cross-layer I/O Coordination