
Research On I/O Architecture And Implementing Technology For High Performance Computing

Posted on: 2010-09-30  Degree: Doctor  Type: Dissertation
Country: China  Candidate: Q Li
GTID: 1118360278456537  Subject: Computer Science and Technology
Abstract/Summary:
Numerical simulation is one of the main methods of scientific research and exploration. It places a tremendous and continuously growing demand on the computation and data processing capacity of high performance computers, and it has driven the development of scientific computing and parallel computer systems. High Performance Computing has now entered the Petaflops era, and storage systems have entered the Petabyte era. Petascale computing poses tremendous challenges for data storage capacity, I/O performance, scalability, reliability, availability, and manageability.

The I/O bottleneck, however, prevents large-scale parallel systems from achieving higher efficiency, and it does so in two ways. First, I/O performance is restricted by factors such as I/O device speed and I/O architecture, which leaves I/O speed and computing speed significantly mismatched. Second, scaling up the system makes disk drive failures more frequent and lengthens the time needed to reconstruct a failed drive; as a consequence, the availability of the I/O system becomes a critical issue.

Effective solutions to the I/O bottleneck can be sought at six levels: applications, algorithms, languages and compilers, run-time libraries, operating systems, and I/O architecture. Of these levels, I/O architecture is the most fundamental.

To meet these I/O requirements and challenges, and in connection with our research task on a high performance parallel computing system, this paper presents a theoretical study of I/O architectures that makes high performance and scalability achievable at the I/O architecture level. The paper also focuses on I/O implementation mechanisms, including an I/O-included memory consistency model and its implementation, intelligent I/O control, hybrid storage, and transactional storage management, in order to improve I/O performance and availability. The main work and innovations of this paper are as follows.

1. I/O-restricted parallel speedup model

Current parallel I/O performance analysis lacks rigorous theoretical models to support I/O architecture design. The paper studies the impact of I/O workload on the scalability of parallel computing systems and proposes the I/O-restricted parallel speedup model. Based on this model, which can be used to guide I/O architecture design, a scalable parallel I/O architecture for HPC is presented. The paper also analyzes several strategies for improving system scalability, which serve as the basis for further study.

2. I/O-included general memory consistency model and implementing technology

To address the consistency problem of shared memory systems with global DMA operations, the paper defines the concept of an I/O-included general program. Based on this concept, the paper studies the general memory consistency model and builds the general sequential consistency model, the general release consistency model, and the general scope consistency model. Using the general scope consistency model, the paper designs and implements, at the hardware level, a CC-NUMA cache coherence protocol with global DMA and a globally shared parallel I/O architecture. The experimental results show that the system performs well in both I/O bandwidth and scalability: the measured parallel I/O bandwidth reaches 20.2 GB/s and scales well with the number of system processes.

3. Intelligent I/O scheduling algorithm based on reinforcement learning

To improve the I/O service efficiency of RAID and optimize the I/O performance of parallel applications, the paper presents RL-scheduler, an intelligent I/O scheduling algorithm for RAID controllers based on reinforcement learning. RL-scheduler uses a Q-learning strategy to implement a self-controlling and self-optimizing scheduler. The algorithm takes into account scheduling fairness, disk seek time, and the I/O access efficiency of MPI applications. In addition, the proposed interleaved organization of multiple Q-tables improves the efficiency of Q-table updates. The experimental results show that, on a large-scale parallel system running multiple parallel applications, RL-scheduler considerably shortens the average I/O waiting time of parallel applications, thereby increasing the practical I/O bandwidth and improving system scalability.

4. Hybrid storage management algorithm supporting transaction semantics

To address the requirements and challenges posed by HPC, the paper combines the ideas of transactional storage management and hybrid storage acceleration, and introduces an electro-magnetic hybrid storage management algorithm that supports transaction semantics. A token-based protocol is designed to handle conflicts between I/O transactions, and an adaptive logical partition algorithm is proposed to manage Solid State Disk (SSD) storage. Simulation results show that the electro-magnetic hybrid storage system handles transactions with varied access patterns well, effectively improves the SSD hit rate, and hides the overhead of version management and conflict detection. Both I/O performance and availability are significantly improved.
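
The abstract does not give the details of RL-scheduler, so the following is only a minimal sketch of the general idea behind item 3: a tabular Q-learning agent that picks which pending disk request to serve next. Everything here is an assumption made for illustration, not the dissertation's design: the class name `QScheduler`, the state (current head zone), the action (zone of the request to serve), the reward shape in `seek_reward`, and all parameter values. The interleaved multi-Q-table organization mentioned above is not modeled.

```python
import random
from collections import defaultdict


class QScheduler:
    """Toy tabular Q-learning picker for pending disk requests (hypothetical)."""

    def __init__(self, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
        self.alpha = alpha      # learning rate (assumed value)
        self.gamma = gamma      # discount factor (assumed value)
        self.epsilon = epsilon  # exploration probability (assumed value)
        self.q = defaultdict(float)  # (state, action) -> estimated value
        self.rng = random.Random(seed)

    def choose(self, head_zone, pending_zones):
        """Epsilon-greedy choice among zones that have pending requests."""
        if self.rng.random() < self.epsilon:
            return self.rng.choice(pending_zones)
        return max(pending_zones, key=lambda z: self.q[(head_zone, z)])

    def update(self, state, action, reward, next_state, next_actions):
        """Standard one-step Q-learning update."""
        best_next = max(
            (self.q[(next_state, a)] for a in next_actions), default=0.0
        )
        td_target = reward + self.gamma * best_next
        self.q[(state, action)] += self.alpha * (td_target - self.q[(state, action)])


def seek_reward(head_zone, target_zone, waited):
    """Assumed reward: penalize seek distance, credit serving long-waiting requests."""
    return -abs(head_zone - target_zone) + 0.5 * waited


if __name__ == "__main__":
    sched = QScheduler()
    head, pending = 0, [1, 5, 7]          # head position and zones with work
    target = sched.choose(head, pending)  # pick the next request to serve
    r = seek_reward(head, target, waited=2)
    remaining = [z for z in pending if z != target]
    sched.update(head, target, r, target, remaining)
    print(target, sched.q[(head, target)])
```

In this sketch the waiting-time term in the reward stands in for scheduling fairness and the distance term for disk seek time; a real controller would need a richer state and a measured reward signal.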
Keywords/Search Tags: High Performance Computing, I/O Architecture, General Memory Consistency Model, Global Shared I/O, Intelligent I/O Control, Hybrid Storage System, Transactional Storage Management