Research On Large Scale Parallel Storage Systems For Super Computing

Posted on: 2015-01-18
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Z L Song
Full Text: PDF
GTID: 1318330536967212
Subject: Computer Science and Technology
Abstract/Summary:
In recent years, supercomputing technology has played an increasingly significant role in scientific research and in the development of the national economy, and it is finding more and more applications. Supercomputing and data storage technologies are now developing towards the Exascale era. The dramatic increase in computing speed and in the scale of data will present huge challenges for data processing performance in supercomputer systems. To deal with the challenges facing large-scale storage systems in supercomputers, we focus our research on improving the performance of the IO subsystem by exploiting new storage technologies. This dissertation studies a novel storage architecture, performance enhancement of metadata servers, performance optimization of Flash-based SSD and RAID technologies, and power optimization, all of which aim to improve the performance of large-scale parallel storage systems.

The major contributions of this dissertation can be summarized as follows:

1. To address the lack of quantitative optimization strategies for large-scale parallel storage systems, mathematical models of bandwidth and latency are explored and summarized, based on an analysis of the ideal model of parallel storage systems and of the behavior of parallel storage systems. On the basis of these mathematical models and optimization strategies, a novel software-defined storage architecture is proposed. Experimental results show that the architecture is easier to expand, upgrade, and optimize.

2. To solve the problem of metadata management, a new metadata multi-partition parallel processing technology is proposed. It follows the design idea of partition parallelism with centralized management and runs on a service platform consisting of a Flash array and a large SMP node. It supports high concurrency on a single metadata server. Experimental results show that it significantly improves the IO throughput and scalability of metadata servers.

3. To satisfy the requirements of high throughput and low latency in large-scale parallel storage systems, a novel Flash translation layer (FTL) is proposed. It is based on temporal and spatial locality and on data de-duplication, which reduces the RAM space needed for the mapping table. The new FTL improves the hit ratio of page-level mappings in RAM and the read/write performance with little overhead, and thereby the performance of Flash-based solid-state drives.

4. To prolong the lifetime of SSDs, we also implement a high-performance Flash-based RAID that uses a cache-based reconfigurable policy. On the one hand, pages that have been updated out of place remain present in Flash, which protects the valid data pages and reduces the frequency of parity updates. On the other hand, with the help of data reorganization in the cache, it performs fewer updates in Flash than a traditional RAID system and prolongs the lifetime of the Flash memory array.

5. A power consumption model is proposed for large-scale Flash-based storage systems, under the assumption that the entire storage system of a supercomputer is built from Flash storage devices. In addition, a novel communication-classification simulated annealing algorithm is proposed to find a locally optimal solution. It classifies processes by their communication characteristics in order to build a suboptimal initial task graph, which then serves as the starting point for simulated annealing. Experiments show that it is much faster than other current algorithms and finds better locally optimal solutions.
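The idea in contribution 3 of keeping only part of the page-level mapping table in RAM can be illustrated with a small sketch. It is not the FTL proposed in the dissertation: the class name CachedMappingTable, its parameters, and the eviction and prefetch policies are all assumptions made for illustration. An LRU cache stands in for temporal locality, loading a few neighbouring mappings on each miss stands in for spatial locality, and de-duplication is not modeled at all.

```python
from collections import OrderedDict


class CachedMappingTable:
    """Toy page-level mapping cache (hypothetical, not the dissertation's FTL)."""

    def __init__(self, flash_table, capacity, prefetch=4):
        if capacity < 1 or prefetch < 1:
            raise ValueError("capacity and prefetch must be at least 1")
        self.flash_table = dict(flash_table)  # full table; stands in for Flash
        self.capacity = capacity              # max mappings kept in RAM
        self.prefetch = prefetch              # mappings loaded per miss
        self.cache = OrderedDict()            # logical -> physical page, LRU order
        self.hits = 0
        self.misses = 0

    def _insert(self, lpn, ppn):
        # Insert or refresh a mapping, then evict least recently used entries.
        self.cache[lpn] = ppn
        self.cache.move_to_end(lpn)
        while len(self.cache) > self.capacity:
            self.cache.popitem(last=False)

    def lookup(self, lpn):
        """Return the physical page for lpn, or None if it is unmapped."""
        if lpn in self.cache:
            self.hits += 1
            self.cache.move_to_end(lpn)  # temporal locality: keep hot entries
            return self.cache[lpn]
        self.misses += 1
        if lpn not in self.flash_table:
            return None
        # Spatial locality: also load the mappings of the following pages.
        for n in range(lpn + 1, lpn + self.prefetch):
            if n in self.flash_table and n not in self.cache:
                self._insert(n, self.flash_table[n])
        # Insert the requested mapping last so it is the most recently used.
        self._insert(lpn, self.flash_table[lpn])
        return self.flash_table[lpn]
```

With a sequential access pattern, each miss brings in the mappings for the next few pages, so most of the following lookups are served from RAM. The hits and misses counters make it possible to compare hit ratios for different capacity and prefetch values.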
Keywords/Search Tags:Super Computer, Storage Architecture, Global Shared Storage System, RAID, Hybrid Storage System, Solid State Storage, Flash