Research On I/O Architecture And Implementing Technology For High Performance Computing

Posted on:2010-09-30

Degree:Doctor

Type:Dissertation

Country:China

Candidate:Q Li

Full Text:PDF

GTID:1118360278456537

Subject:Computer Science and Technology

Abstract/Summary:

Numerical simulation is one of the main methods of scientific research and exploration. It places a tremendous and continuously growing demand on the computation and data processing capacity of high performance computers, and has driven the development of scientific computing and parallel computer systems. High Performance Computing has now entered the Petaflops era, and storage systems have entered the Petabyte era. Petascale computing poses tremendous challenges in data storage capacity, I/O performance, scalability, reliability, availability, and manageability. The I/O bottleneck prevents large-scale parallel systems from achieving higher efficiency, and it appears in two ways. First, I/O performance is restricted by factors such as I/O device speed and I/O architecture, which leaves I/O speed and computing speed significantly mismatched. Second, scaling up the system makes disk drive failures more frequent and lengthens the time needed to reconstruct a failed drive; as a consequence, the availability of the I/O system becomes a critical issue.

Effective solutions to the I/O bottleneck can be sought at six levels: applications, algorithms, languages and compilers, run-time libraries, operating systems, and I/O architecture. Of these levels, I/O architecture is the most fundamental. To meet these I/O requirements and challenges, and in conjunction with our research task on a high performance parallel computing system, this paper presents a theoretical study of I/O architectures that makes high performance and scalability achievable at the I/O architecture level. The paper also focuses on I/O implementation mechanisms, including an I/O-included memory consistency model and its implementation, intelligent I/O control, hybrid storage, and transactional storage management, in order to improve I/O performance and availability. The main work and innovations of this paper are as follows.

1. I/O-restricted parallel speedup model

Current parallel I/O performance analysis lacks theoretical models to support I/O architecture design. The paper studies the impact of I/O workload on the scalability of parallel computing systems and proposes the I/O-restricted parallel speedup model. Based on this model, which can be used to guide I/O architecture design, a scalable parallel I/O architecture for HPC is presented. The paper also analyzes several strategies for improving system scalability, which serve as the basis for further study.

2. I/O-included general memory consistency model and implementing technology

To address the consistency problem of shared memory systems with global DMA operations, the paper defines the concept of an I/O-included general program. Based on this concept, the paper studies the general memory consistency model and builds the general sequential consistency model, the general release consistency model, and the general scope consistency model. Using the general scope consistency model, the paper designs and implements, at the hardware level, a CC-NUMA cache coherence protocol with global DMA and a global shared parallel I/O architecture. Experimental results show that the actual parallel I/O bandwidth reaches 20.2 GB/s and scales well with the number of system processes.

3. Intelligent I/O scheduling algorithm based on reinforcement learning

To improve the I/O service efficiency of RAID and optimize the I/O performance of parallel applications, the paper presents RL-scheduler, an intelligent I/O scheduling algorithm for RAID controllers based on reinforcement learning. RL-scheduler uses a Q-learning strategy to implement a self-controlling and self-optimizing scheduler. The algorithm takes into account scheduling fairness, disk seek time, and the I/O access efficiency of MPI applications. Furthermore, the proposed interleaved organization of multiple Q-tables improves the efficiency of Q-table updates. Experimental results show that, on a large-scale parallel system running multiple parallel applications, RL-scheduler considerably shortens the average I/O waiting time of parallel applications, which increases the practical I/O bandwidth and improves system scalability.

4. Hybrid storage management algorithm supporting transaction semantics

To address the requirements and challenges posed by HPC, the paper combines the ideas of transactional storage management and hybrid storage acceleration, and introduces an electro-magnetic hybrid storage management algorithm that supports transaction semantics. A token-based protocol is designed to handle conflicts between I/O transactions, and an adaptive logical partition algorithm is proposed to manage Solid State Disk (SSD) storage. Simulation results show that the electro-magnetic hybrid storage system handles transactions with varied access patterns, effectively improves the SSD hit rate, and hides the overhead of version management and conflict detection. Both I/O performance and availability are significantly improved.
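
As a rough illustration of the Q-learning strategy mentioned in item 3, the sketch below shows a generic tabular Q-learning agent that could pick which pending request queue to serve next. It is not the dissertation's RL-scheduler: the class name `QScheduler`, the state labels, the reward values, and the parameters `alpha`, `gamma`, and `epsilon` are all assumptions made for this example, and the interleaved multi-Q-table organization described in the abstract is not reproduced.

```python
import random
from collections import defaultdict


class QScheduler:
    """Hypothetical tabular Q-learning agent (illustrative only).

    A state is any hashable summary of the scheduler's situation
    (for example, a coarse head position); an action is the index
    of the request queue to serve next.
    """

    def __init__(self, n_actions, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
        self.n_actions = n_actions
        self.alpha = alpha      # learning rate
        self.gamma = gamma      # discount factor
        self.epsilon = epsilon  # exploration probability
        self.rng = random.Random(seed)
        # One row of action values per state, created on first access.
        self.q = defaultdict(lambda: [0.0] * n_actions)

    def choose(self, state):
        """Epsilon-greedy choice: explore at random, otherwise take the best-known action."""
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(self.n_actions)
        row = self.q[state]
        return max(range(self.n_actions), key=row.__getitem__)

    def update(self, state, action, reward, next_state):
        """Standard Q-learning update:
        Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
        """
        best_next = max(self.q[next_state])
        td_target = reward + self.gamma * best_next
        self.q[state][action] += self.alpha * (td_target - self.q[state][action])


if __name__ == "__main__":
    # Assumed toy reward: larger is better, e.g. the negative of the
    # waiting time observed after serving the chosen queue.
    sched = QScheduler(n_actions=2, alpha=0.5, gamma=0.9, epsilon=0.0)
    sched.update("head_low", 1, 10.0, "head_high")
    print(sched.choose("head_low"))
```

In a real controller the reward would have to combine the factors the abstract names (fairness, seek time, and MPI application access efficiency); how the dissertation weights them is not stated here, so the sketch leaves the reward as an input.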

Keywords/Search Tags:

High Performance Computing, I/O Architecture, General Memory Consistency Model, Global Shared I/O, Intelligent I/O Control, Hybrid Storage System, Transactional Storage Management

Related items

1	Genetic Algorithm In Financial High Performance Computing
2	Naplus: A Software Shared Memory For Virtual Clusters
3	Design And Implementation On Power Management Techniques For High-performance Computing System
4	The Research And Development Of The Point Based Global Illumination Algorithm On The Intel MIC Architecture
5	Research On High Performance Parallel Computing Architecture Based On FPGA+DSP
6	Research On General Purpose GPU Computing Technology In The High Performance Computing Platform
7	Computation Model And Performance Optimization On Shared Memory Architecture
8	A Traffic Aware Hybrid Write Buffer In High Performance Computing
9	Research On Converged Data Management Techniques For High-performance Computing Systems
10	Research On Large Scale Parallel Storage Systems For Super Computing