
Research On Key Technology Of I/O Knowledge-aware Parallel Storage System

Posted on: 2018-05-28
Degree: Doctor
Type: Dissertation
Country: China
Candidate: W R Dong
GTID: 1368330623950316
Subject: Computer Science and Technology
Abstract/Summary:
With the rapid development of high-performance computing (HPC) technology, the computing performance of high-performance computers has improved greatly over the last decade. The performance of storage subsystems, however, has not grown at a matching rate, which widens the performance gap between computing systems and storage systems. Computing resources often sit idle while waiting for I/O to complete, which wastes them, and longer I/O time also lengthens application run time. On a high-performance computer that has already been built, the I/O performance actually achieved is often significantly lower than the peak performance the storage system can provide, because of I/O interference and other factors. Designing efficient I/O optimization techniques is therefore a key issue in HPC today.

Caching, prefetching, and I/O request scheduling have proved to be effective means of I/O optimization. As supercomputers scale up and applications become more varied and complex, these techniques still leave considerable room for optimization, and the main problem they face is how to make effective use of the rich I/O features of applications. This dissertation focuses on optimizing I/O performance with I/O knowledge and studies three techniques: I/O characteristic acquisition, I/O feature-aware caching and prefetching, and I/O feature-aware I/O scheduling. The main contributions are as follows.

First, existing approaches to I/O characteristic acquisition are inflexible and limited in scope. To address this, this dissertation designs FTracer, an I/O characteristic acquisition and analysis tool built on the user-space file system framework FUSE, which obtains the characteristics of each application effectively and flexibly with low overhead. To meet diverse user needs, FTracer supports both offline and online analysis of the raw I/O characteristic data. To reduce the overhead of acquisition and analysis, FTracer optimizes the FUSE kernel module and user-space library to reduce the number of context switches and increase the I/O trace request size, and it alternates I/O buffers to avoid the efficiency loss caused by interference between the I/O tracing process and online I/O analysis. To widen the tool's scope of application, FTracer provides online interaction with the user and the job manager to specify the nodes on which applications that do not use the MPI model run.

Second, data in high-performance computing is often very stable, so judging from access history how valuable data is to cache and prefetch has become an important means of improving the I/O performance of HPC applications. However, because existing distributed caches lack file-level I/O access characteristics, they can suffer from inefficient use of cache space, longer I/O request latency, and low cache hit rates. To this end, this dissertation proposes SFDC, a distributed cache framework that is aware of file-level I/O features, to improve the accuracy of caching and prefetching. By sending the information about the same data object to the same cache server, SFDC lets that server consult the file's access history to infer the caching and prefetching value of the data. To avoid the single-server bottleneck caused by large files, SFDC splits large files into multiple objects and places each object on one cache server to spread the load. To reduce the transmission latency of I/O data between cache clients and cache servers, SFDC uses an RDMA-based communication mechanism to match the high bandwidth of memory. To reduce the chance of evicting valuable data and to avoid frequent replacement, SFDC precisely assesses the value of cached data and cleans the cache space regularly.

Third, to counter the I/O performance degradation caused by competition and interference in I/O scheduling, this dissertation presents DWFC, an I/O flow control mechanism based on the real-time load state of storage devices, which reduces I/O interference among multiple concurrent applications and improves overall application I/O performance. DWFC builds on the observation that a large number of I/O requests are non-urgent: they do not need to be completed immediately, and delaying them does not hurt application performance. Because the real-time load differs across storage devices, DWFC delays sending data for I/O requests that target heavily loaded devices, which improves the processing efficiency of other applications' I/O requests. DWFC partitions I/O requests according to their targets. To guarantee I/O performance for each application, DWFC uses an urgency-based request scheduling algorithm, US, which takes into account the various factors that affect each request's timeliness and sends the more urgent requests first.
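
The abstract does not give SFDC's placement details, so the following is only a minimal sketch of the idea it describes: large files are split into objects, and every request for the same object is routed to the same cache server so that server sees the object's full access history. The object size, the function names `object_ids` and `server_for`, and the use of a SHA-256 hash with a modulo are all assumptions made for illustration, not the dissertation's method.

```python
import hashlib

# Hypothetical object size (4 MiB); the source does not state one.
OBJECT_SIZE = 4 * 1024 * 1024


def object_ids(path, offset, length, object_size=OBJECT_SIZE):
    """Return the (path, index) pairs of the objects a byte range touches."""
    if length <= 0:
        return []
    first = offset // object_size
    last = (offset + length - 1) // object_size
    return [(path, index) for index in range(first, last + 1)]


def server_for(obj, num_servers):
    """Deterministically map one object to one cache server.

    Assumption: a plain hash-modulo placement. The same object always maps
    to the same server, while different objects of one large file can land
    on different servers.
    """
    path, index = obj
    digest = hashlib.sha256(f"{path}:{index}".encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_servers


if __name__ == "__main__":
    # A 10 MiB read starting at offset 0 touches three 4 MiB objects.
    for obj in object_ids("/data/run1.out", 0, 10 * 1024 * 1024):
        print(obj, "-> server", server_for(obj, num_servers=8))
```

With a fixed number of servers this mapping is stable across clients, which is what lets a server accumulate per-object history. A real deployment that adds or removes servers would likely need something like consistent hashing instead of a plain modulo.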
Keywords/Search Tags: HPC, Parallel Storage System, I/O Characterization, I/O Cache, Prefetching, I/O Interference