
Research On Optimizing I/O Performance Of High Performance Computing Systems

Posted on: 2019-12-21  Degree: Doctor  Type: Dissertation
Country: China  Candidate: J Yu  Full Text: PDF
GTID: 1368330623950479  Subject: Computer Science and Technology
Abstract/Summary:
With the growing demands of scientific applications, the computing capability of high-performance computers is entering the exascale level. However, I/O performance has not kept pace with computing power, which results in a serious I/O bottleneck: applications spend more time waiting for data I/O, significantly impacting overall system performance. Researchers are building storage systems on top of compute-node-local fast storage devices (such as NVMe SSDs) to alleviate the I/O bottleneck. However, user jobs have varying I/O bandwidth requirements, so placing these expensive devices on all compute nodes and building them into a global storage system is seriously wasteful. In addition, current node-local storage systems must cope with the challenging small-I/O and rank-0 I/O patterns of HPC workloads. In this paper, we present a Workload-Aware Temporary Cache (WatCache) to meet these challenges. We design a workload-aware node allocation method that allocates fast storage devices to jobs according to their I/O requirements and merges each job's devices into a separate temporary cache space. We implement a metadata caching strategy that reduces the metadata overhead of I/O requests to improve small-I/O performance. We design a data layout strategy that distributes consecutive data exceeding a threshold across multiple devices to achieve higher aggregate bandwidth for rank-0 I/O.

Job scheduling in HPC systems by default allocates adjacent compute nodes to jobs to minimize communication overhead. However, this is no longer appropriate for data-intensive jobs running on systems with an I/O forwarding layer, where each I/O node performs I/O on behalf of a subset of nearby compute nodes. Under the default node allocation strategy, a job's nodes are located close to each other, so the job uses only a limited number of I/O nodes. Since the I/O activities of jobs are bursty, at any moment only a minority of jobs in the system are busy processing I/O. Consequently, the bursty I/O traffic in the system is also concentrated in space, making the load on I/O nodes highly unbalanced. In this paper, we use job logs and I/O traces collected from Tianhe-1A to quantitatively analyze the two causes of spatially bursty I/O: uneven I/O traffic among a job's processes and uneven distribution of a job's nodes. Based on this analysis, we propose a node allocation strategy that takes into account the differing I/O traffic of processes, so that the I/O traffic of data-intensive jobs can be processed by more I/O nodes and more evenly.

The I/O forwarding layer has become a standard storage layer in today's HPC systems in order to scale storage systems to new levels of concurrency. With the deepening of the storage hierarchy, I/O requests must traverse several types of nodes to access the required data, including compute nodes, I/O nodes, and storage nodes. It has become difficult to control the data path and to apply cross-layer I/O optimizations. In this paper, we analyze the problems caused by an uncoordinated I/O stack, including the inability to expose cross-node data locality, load imbalance on I/O nodes, and I/O contention on storage nodes. To this end, we propose a well-coordinated I/O stack, which coordinates the data path between compute nodes and I/O nodes for better load balancing and data locality through a job-level I/O node mapping, and coordinates the data path between I/O nodes and storage nodes to reduce I/O interference.
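The threshold-based data layout described for WatCache might be sketched as follows. This is an illustrative sketch only: the function name `layout`, the 4 MiB threshold, and the 1 MiB stripe unit are all hypothetical values chosen for the example, not the dissertation's actual parameters.

```python
def layout(offset, length, num_devices,
           threshold=4 * 1024 * 1024, stripe=1024 * 1024):
    """Map a contiguous request to cache-device extents.

    Requests at or below the threshold stay on a single device,
    keeping small I/O cheap.  Larger requests (e.g. rank-0 I/O)
    are striped round-robin across all devices so one writer can
    use the aggregate bandwidth.  Returns (device, offset, length)
    tuples.  All parameter values here are illustrative.
    """
    if length <= threshold:
        # Small request: one extent on one device, chosen by offset.
        return [((offset // stripe) % num_devices, offset, length)]
    extents = []
    pos, end = offset, offset + length
    while pos < end:
        # Cut at the next stripe boundary (or the end of the request).
        chunk_end = min((pos // stripe + 1) * stripe, end)
        extents.append(((pos // stripe) % num_devices, pos, chunk_end - pos))
        pos = chunk_end
    return extents
```

A 3 MiB write stays on one device, while an 8 MiB write is split into eight 1 MiB extents spread over all devices.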
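The idea of allocating nodes according to processes' I/O traffic could be approximated with a greedy heuristic like the one below. This is a simplified sketch under assumed conditions (one process per node, a static compute-node-to-I/O-node grouping, and known per-process traffic estimates); the function `allocate` and its parameters are hypothetical, not the strategy actually evaluated on Tianhe-1A.

```python
from collections import defaultdict

def allocate(free_nodes, traffic, nodes_per_ionode=32):
    """Pick len(traffic) compute nodes for a job.

    free_nodes : ids of currently free compute nodes
    traffic    : per-process I/O traffic estimates (one process
                 per node, for simplicity)

    Greedy heuristic: place the heaviest remaining process on a
    free node served by the I/O node with the least accumulated
    traffic, so the job's I/O load spreads over more I/O nodes.
    """
    load = defaultdict(int)
    groups = defaultdict(list)          # I/O node -> its free compute nodes
    for n in free_nodes:
        groups[n // nodes_per_ionode].append(n)
    plan = []
    for t in sorted(traffic, reverse=True):
        # Least-loaded I/O node that still has a free compute node.
        ion = min((i for i in groups if groups[i]), key=lambda i: load[i])
        plan.append((groups[ion].pop(), t))
        load[ion] += t
    return plan, dict(load)
```

With two I/O nodes and traffic estimates `[100, 100, 1, 1]`, the two heavy processes land under different I/O nodes, balancing the forwarded load instead of concentrating it.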
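A job-level I/O node mapping, as opposed to the static vicinity-based mapping, might look like the minimal sketch below: each job gets its own compute-to-I/O-node mapping, rotated by job id so that concurrent jobs start on different I/O nodes. The function `map_job` and the round-robin policy are assumptions for illustration, not the coordination mechanism the paper proposes.

```python
def map_job(job_id, job_nodes, num_ionodes):
    """Build a per-job mapping from compute nodes to I/O nodes.

    Rotating the starting I/O node by job id staggers concurrent
    jobs, and spreading one job's nodes round-robin lets its bursty
    I/O traffic use many I/O nodes instead of a nearby few.
    This round-robin policy is illustrative only.
    """
    start = job_id % num_ionodes
    return {n: (start + k) % num_ionodes
            for k, n in enumerate(sorted(job_nodes))}
```

For example, a four-node job maps one node to each of four I/O nodes, while a second job's mapping begins on a different I/O node.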
Keywords/Search Tags:Parallel I/O, Flash Cache, I/O Forwarding, Spatially Bursty I/O, Node Allocation Strategy, Cross-layer I/O Coordination