
Modeling Of High Performance Computing On Many-core Processors

Posted on: 2021-04-16
Degree: Master
Type: Thesis
Country: China
Candidate: B W Kan
GTID: 2428330629452665
Subject: Computer system architecture
Abstract/Summary:
High performance computing (HPC) is a parallel computing paradigm in which multiple computing resources cooperate on a single task to improve execution speed. State-of-the-art many-core processors for HPC, such as NVIDIA GPUs, Intel KNL, the IBM Power series, and the Chinese Sunway and Phytium-Matrix processors, differ in computing power but share similar architectural concepts. They all follow the "divide and conquer" rule: a heavy job is partitioned into a large number of tasks, each handled by a small sub-core. Correspondingly, the hardware has a layered architecture. A parallel algorithm is not merely a collection of serial algorithms, so a sound model, namely the parallel computing model, is crucial for guiding the mapping from parallel computation to the requirements of parallel hardware. An algorithm that is well matched to the model takes full advantage of the hardware and streamlines the computation; a poorly matched one incurs higher time and space complexity.

Since the origin of parallel computing, researchers have sought a generalized model for it. Classical models, including PRAM, LogP, C3, Hypercube, BSP, and Multi-BSP, have never taken scheduling factors into consideration. This paper proposes a tree-structured scheduling model for parallel computing on many-core architectures. It is a generalized model that can describe diverse many-core processors by adapting its parameter settings. Moreover, we classify instructions into three types: computation, communication, and scheduling. According to the model, the execution of a task can be expressed in terms of these classified instructions. We also analyze the concept of scheduling instructions in detail and illustrate how they function using PTX, an assembly-like language.

This paper describes the logical structure of the tree-structured scheduling model and quantifies its performance by mathematical reasoning. Considerable effort is devoted to describing how nodes are scheduled, how tasks are divided, and how communication and data transmission proceed during task execution. We also set up a time cost model in which the whole job is partitioned into three major steps: task distribution, task calculation, and task feedback. We then apply the proposed model to several representative many-core architectures. Four classical parallel algorithms are selected for theoretical derivation followed by experimental verification: matrix multiplication, prefix summation, odd-even sorting, and FFT. The experimental results agree well with the theoretical derivation, which supports the effectiveness and generality of the proposed model.
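The three steps named in the abstract (task distribution, task calculation, task feedback) can be illustrated with prefix summation, one of the four algorithms listed. The sketch below is only a toy illustration and is not the thesis's model: the function name `tree_prefix_sum`, the parameters `fanout` and `leaf_size`, and the simple event counters in `stats` are all assumptions introduced here, and the "tree" is simulated serially with recursion rather than run on real sub-cores.

```python
from itertools import accumulate


def tree_prefix_sum(values, fanout=4, leaf_size=4, stats=None):
    """Inclusive prefix sum over a hypothetical scheduling tree.

    Returns (prefix_sums, stats). `stats` counts how often each of the
    three phases occurs; it is an illustrative counter, not a time model.
    """
    if fanout < 2 or leaf_size < 1:
        raise ValueError("fanout must be >= 2 and leaf_size >= 1")
    if stats is None:
        stats = {"distribute": 0, "calculate": 0, "feedback": 0}

    if len(values) <= leaf_size:
        # Task calculation: a leaf scans its own small chunk serially.
        stats["calculate"] += 1
        return list(accumulate(values)), stats

    # Task distribution: this node splits its job among up to `fanout` children.
    chunk = -(-len(values) // fanout)  # ceiling division
    parts = [values[i:i + chunk] for i in range(0, len(values), chunk)]
    stats["distribute"] += len(parts)

    result, offset = [], 0
    for part in parts:
        scanned, _ = tree_prefix_sum(part, fanout, leaf_size, stats)
        # Task feedback: the child's result returns to the parent, which
        # shifts it by the running total of all earlier chunks.
        stats["feedback"] += 1
        result.extend(x + offset for x in scanned)
        offset = result[-1]
    return result, stats


if __name__ == "__main__":
    data = list(range(1, 17))
    sums, counts = tree_prefix_sum(data, fanout=4, leaf_size=4)
    print(sums == list(accumulate(data)))
    print(counts)
```

With 16 inputs, `fanout=4`, and `leaf_size=4`, the tree has a single level of four leaves, so each phase is counted four times. A real cost model would weight each phase by hardware-dependent parameters, which this sketch does not attempt.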
Keywords/Search Tags:high performance computing, parallel computing, many-core processor, tree structured model, parallel algorithm