
The Study Of GPGPU Microarchitecture And Performance Analysis

Posted on: 2018-01-25 | Degree: Master | Type: Thesis
Country: China | Candidate: Q L Xing | Full Text: PDF
GTID: 2348330515974041 | Subject: Computer technology
Abstract/Summary:
Over the past ten years, GPU processing performance has grown very rapidly. A GPU is structurally different from a CPU: a GPU devotes more of its transistors to computation, while a CPU devotes more of them to control logic, so the two play different roles under different design goals. GPUs have quickly moved beyond image processing into general-purpose computing, opening a new field called GPGPU (General-Purpose Computing on Graphics Processing Units). GPGPU is designed to handle parallel tasks, so studying parallel computing models is worthwhile. Although classical parallel computing models such as the PRAM, BSP and LogP models were proposed many years ago, studying them leads to a deeper understanding of the GPGPU structure.

Since the GPGPU concept emerged, much research has focused on using its computing power to dramatically improve the efficiency of solving a single problem, rather than on the architecture itself. The main reason is that the detailed chip structure, pipeline and storage design are trade secrets, so such information is difficult to obtain for research. NVIDIA and AMD are the two major GPGPU manufacturers. Compared with AMD, NVIDIA's official documentation is more detailed and its CUDA suite is more complete, so this thesis focuses on NVIDIA chips and uses the open-source GPGPU-Sim simulator to simulate an NVIDIA GPU.

The thesis first compares several parallel computing models, namely the PRAM, BSP and LogP models, in terms of the similarities and differences of their parameters and core ideas, and briefly reviews the current state of GPU research. It then proposes a new model, NKGPGPU, which describes in detail the hardware structure, the logical structure of tasks, the code structure and the mapping relationships among them. Overall, NKGPGPU comprises five sub-models: the hardware structure sub-model, the task structure sub-model, the task organization sub-model, the task execution sub-model and the task scheduling sub-model. The hardware structure sub-model gives the main components of the NKGPGPU chip. The task organization sub-model gives the code structure suited to NKGPGPU and the mapping between code and tasks; in addition, a model of the starting relationships between tasks is given. The task execution sub-model gives the mapping between code and hardware. The task scheduling sub-model gives the mapping between the task topology and the hardware structure.

The thesis also presents a performance analysis model consistent with the proposed NKGPGPU. For the three main factors that affect GPGPU performance (the pipeline, shared memory and global memory), detailed experiments were carried out with different numbers of threads. The pipeline experiments measure how the number of cycles differs across instruction types and use this difference to determine the relationship between instructions and the pipeline. Shared memory and global memory are studied in a similar way, by measuring the cycles taken by sequences of consecutive access instructions. The proposed GPGPU model is useful to GPGPU hardware engineers and software programmers, and the experimental methods and ideas for GPGPU-Sim can serve as a basis for further study of GPGPU.
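As a rough illustration of how the parameters of the classical models mentioned above differ, the following Python sketch encodes the standard textbook cost formulas for BSP and LogP. PRAM has no communication parameters and is therefore omitted. The class names, function names and numeric values are assumptions chosen for illustration only; they are not taken from the thesis and do not describe NKGPGPU.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BSPMachine:
    g: float  # communication cost per word (time units per word)
    l: float  # barrier synchronization cost per superstep


@dataclass(frozen=True)
class LogPMachine:
    L: float  # network latency of one small message
    o: float  # processor overhead to send or receive a message
    g: float  # minimum gap between consecutive messages on one processor
    P: int    # number of processors


def bsp_superstep_cost(machine, work, h_relation):
    """Textbook BSP superstep cost: max local work + g * max words moved + l."""
    return max(work) + machine.g * max(h_relation) + machine.l


def logp_message_time(machine):
    """Textbook LogP time for one small message: send overhead + latency + receive overhead."""
    return 2 * machine.o + machine.L


# Hypothetical parameter values, for illustration only.
bsp = BSPMachine(g=4.0, l=100.0)
cost = bsp_superstep_cost(bsp, work=[500, 620, 580], h_relation=[10, 25, 15])

logp = LogPMachine(L=6.0, o=2.0, g=4.0, P=8)
msg_time = logp_message_time(logp)

print(cost, msg_time)
```

BSP charges communication in bulk once per superstep, while LogP charges each message individually; this is one of the differences in core ideas that a comparison of the models would bring out.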
Keywords/Search Tags:GPGPU, Fermi, PTX, Microarchitecture