Research On Converged Data Management Techniques For High-performance Computing Systems

Posted on: 2021-02-04
Degree: Doctor
Type: Dissertation
Country: China
Candidate: P Cheng
Full Text: PDF
GTID: 1488306548992659
Subject: Computer Science and Technology
Abstract/Summary:
With the convergence of high-performance computing (HPC), big data, and artificial intelligence (AI), the HPC community is pushing for “triple use” systems to expedite scientific discoveries. Because of the explosive growth of scientific data and the fundamental differences in I/O characteristics among HPC, big data, and AI workloads, supporting these converged applications on HPC systems presents formidable storage and data management challenges. Moreover, emerging hierarchical storage architectures and diverse data locating requirements exacerbate the bottlenecks. To enable hierarchical and adaptive data management and to accelerate converged applications on HPC systems, this thesis proposes a spectrum of data management techniques. The main work of this thesis includes:

1. The tiered data management system (TDMS), which integrates distributed and hierarchical storage spaces to stage data between the components of converged applications. TDMS lets users exploit the advantages of different storage tiers by customizing data management strategies for common data access patterns. Because hierarchical storage architectures change data locality, TDMS uses data-aware task scheduling to launch tasks on the compute nodes where the locality of the required data can be best exploited. We evaluate TDMS with realistic applications, and the experiments show that it improves I/O performance and provides up to a 1.54x speedup compared with the Lustre file system.

2. Adaptive data management strategies that target hierarchical storage architectures. The adaptive data placement strategy combines the characteristics of converged applications with runtime system information to choose the best storage tier for each piece of data. It explores the use of machine learning techniques to make data placement decisions that minimize the overall I/O time of the entire application. The adaptive data prefetching strategy identifies the data to prefetch and moves it to the top storage tier before it is requested. It uses the dataflow topology and graph embedding techniques to cluster files. For each cluster of files, locality-based prefetching and learning-based prefetching are combined to implement a low-cost, high-accuracy prediction model. Compared with a fixed data placement strategy, the adaptive data placement strategy achieves a 1.3x speedup. The adaptive data prefetching strategy outperforms state-of-the-art solutions and reduces the cumulative read time by up to 54.2%.

3. Lightweight index and query strategies that can be coupled with any file system already in use. For data locating requirements at the granularity of a file, the proposed index strategy extracts application-specific metadata from files residing in the underlying file system and constructs a customized hash index structure. For data locating requirements at the granularity of a record, the in-situ indexing philosophy is applied to generate lightweight Range-bitmap indexes while applications are writing data to the file system. Together with the in-memory cache layer, parallel query processing, and the consistent management strategy, the proposed file-level locating service can locate target files in directories containing millions of files within microseconds. The proposed record-level locating service dramatically reduces index building time while delivering query speedups of up to two orders of magnitude over scanning the entire dataset.
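To make the record-level indexing idea concrete, the following is a minimal sketch of a binned bitmap index that is updated as each record is written and then used to answer a range query without scanning every record. It is an illustration under stated assumptions, not the thesis's Range-bitmap design, which the abstract does not describe: the class name `BinnedBitmapIndex`, the bin boundaries, and the in-memory `values` list standing in for the data file are all hypothetical.

```python
from bisect import bisect_right


class BinnedBitmapIndex:
    """Hypothetical binned bitmap index: one bitmap per value range."""

    def __init__(self, boundaries):
        # Sorted bin edges; bin i covers [boundaries[i], boundaries[i + 1]).
        self.boundaries = list(boundaries)
        # Each bitmap is a Python int; bit r is set if record r is in the bin.
        self.bitmaps = [0] * (len(self.boundaries) - 1)
        # Stand-in for the records the application writes to the file system.
        self.values = []

    def append(self, value):
        """Index a record at write time (the in-situ step)."""
        if not self.boundaries[0] <= value < self.boundaries[-1]:
            raise ValueError("value outside the indexed range")
        bin_id = bisect_right(self.boundaries, value) - 1
        self.bitmaps[bin_id] |= 1 << len(self.values)
        self.values.append(value)

    def query(self, lo, hi):
        """Return the ids of records with lo <= value < hi."""
        first = max(bisect_right(self.boundaries, lo) - 1, 0)
        last = min(bisect_right(self.boundaries, hi) - 1, len(self.bitmaps) - 1)
        # OR together the bitmaps of every bin that may overlap the range.
        candidates = 0
        for bin_id in range(first, last + 1):
            candidates |= self.bitmaps[bin_id]
        # Check candidates against the data, since edge bins only partly overlap.
        hits = []
        record_id = 0
        while candidates:
            if candidates & 1 and lo <= self.values[record_id] < hi:
                hits.append(record_id)
            candidates >>= 1
            record_id += 1
        return hits


index = BinnedBitmapIndex([0, 10, 20, 30])
for v in [3, 12, 25, 17, 9]:
    index.append(v)
print(index.query(9, 18))  # prints [1, 3, 4]
```

Only the records in bins that overlap the query range are compared against the data, which is where an index of this kind saves work relative to a full scan.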
Keywords/Search Tags: High-performance computing, Converged applications, Hierarchical storage architecture, Data management, Scientific workflows