Research On A Comprehensive Memory Management Framework For CPU-FPGA HMPSoCs

Posted on: 2022-07-19   Degree: Master   Type: Thesis
Country: China   Candidate: M Lin   Full Text: PDF
GTID: 2518306314462624   Subject: Software engineering
Abstract/Summary:
With the rapid development of artificial intelligence and mobile Internet technology, emerging applications have gradually moved from traditional data centers to mobile edge devices, and smart glasses, smart watches, and other smart terminals have entered daily life. These applications process large volumes of data with a high degree of parallelism, so they place heavy demands on computing and storage resources. Unlike data centers, mobile edge devices must be low-power and highly portable, so their computing and storage resources are very limited. Shared resources, such as shared storage, are especially scarce. The computing components of an edge device compete for these shared resources, and contention for shared storage often degrades overall system performance.

At the same time, computing requirements and data volumes are growing rapidly, and with the arrival of the post-Moore's-law era, traditional edge devices that rely on CPUs as the main computing core can no longer meet the needs of emerging applications. Heterogeneous Multiprocessor Systems-on-Chip (HMPSoCs) have gradually become the mainstream of edge computing devices. Common HMPSoCs today include NVIDIA's CPU-GPU HMPSoCs and Xilinx's CPU-FPGA HMPSoCs, which have been applied successfully in edge computing, autonomous driving, and other fields. Making full use of the strengths of each type of processor has become a hot topic in computer architecture. Letting the CPU and GPU, or the CPU and FPGA, each play to its computing strengths while allocating the on-chip shared resources sensibly (for example, shared storage, shared buses, and shared I/O) is a key factor in unlocking the computing potential of HMPSoCs.

In this thesis, we study shared memory resource conflicts on a commercial Xilinx CPU-FPGA HMPSoC platform by running programs in parallel on the CPU and the FPGA accelerator. We found that shared storage conflicts occur simultaneously in the shared last-level cache (LLC) and in main memory. Alleviating this kind of cross-level conflict requires a cross-level memory management strategy. We first generated memory access traces for the CPU and FPGA programs, then used a cache model to obtain the memory access characteristics of each array in those programs. Based on these characteristics, we proposed a simulated-annealing-based comprehensive memory management framework, which derives the optimal data placement scheme and cache allocation scheme.

Experiments on the commercial Xilinx Zynq-7020 HMPSoC platform show that the proposed framework improves overall system performance by 14.89% on average over a baseline greedy strategy and by 10.11% on average over the state-of-the-art FPGA-only data placement strategy. In addition, to increase the memory access load of the test set, we unrolled the innermost loops of the PolyBench benchmark suite with an unrolling factor of 4. In this setting, the proposed framework improves overall system performance by 37.87% on average over the baseline greedy strategy and by 33.55% on average over the state-of-the-art FPGA-centric data placement strategy.
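To make the idea of a simulated-annealing search over data placement and cache allocation more concrete, the following is a minimal sketch, not the thesis's actual framework. Everything in it is an assumption made for illustration: the array profiles in `ARRAYS`, the two-bank memory, the four-way cache, and especially the toy `cost` function, which simply penalizes CPU and FPGA traffic sharing a memory bank and penalizes a side that receives fewer cache ways than its working set needs.

```python
import math
import random

# Hypothetical per-array profile: owner, access count, working set in cache ways.
# In the thesis these characteristics come from traces and a cache model.
ARRAYS = {
    "A": ("cpu", 900, 2),
    "B": ("cpu", 400, 1),
    "C": ("fpga", 800, 2),
    "D": ("fpga", 300, 1),
}
NUM_BANKS = 2  # assumed number of memory banks for data placement
NUM_WAYS = 4   # assumed number of LLC ways split between CPU and FPGA


def cost(placement, cpu_ways):
    """Toy cost (an assumption): bank contention plus cache shortfall penalty."""
    bank_cost = 0.0
    for bank in range(NUM_BANKS):
        load = {"cpu": 0, "fpga": 0}
        for name, (owner, accesses, _) in ARRAYS.items():
            if placement[name] == bank:
                load[owner] += accesses
        # CPU and FPGA traffic in the same bank interfere with each other.
        bank_cost += load["cpu"] * load["fpga"] / 1000.0
    ways = {"cpu": cpu_ways, "fpga": NUM_WAYS - cpu_ways}
    miss_cost = 0.0
    for owner in ("cpu", "fpga"):
        need = sum(ws for (o, _, ws) in ARRAYS.values() if o == owner)
        accesses = sum(a for (o, a, _) in ARRAYS.values() if o == owner)
        shortfall = max(0, need - ways[owner])
        miss_cost += accesses * shortfall / need
    return bank_cost + miss_cost


def anneal(steps=5000, t0=500.0, alpha=0.999, seed=0):
    """Search placement and way split jointly; returns (cost, placement, cpu_ways)."""
    rng = random.Random(seed)
    placement = {name: 0 for name in ARRAYS}  # start with everything in bank 0
    cpu_ways = NUM_WAYS // 2
    cur = cost(placement, cpu_ways)
    best = (cur, dict(placement), cpu_ways)
    temp = t0
    for _ in range(steps):
        new_placement, new_ways = dict(placement), cpu_ways
        if rng.random() < 0.5:
            # Move one array to a randomly chosen bank.
            new_placement[rng.choice(sorted(ARRAYS))] = rng.randrange(NUM_BANKS)
        else:
            # Shift one cache way between CPU and FPGA, clamped to valid range.
            new_ways = min(NUM_WAYS, max(0, cpu_ways + rng.choice((-1, 1))))
        new = cost(new_placement, new_ways)
        # Always accept improvements; accept worse states with Boltzmann probability.
        if new <= cur or rng.random() < math.exp((cur - new) / temp):
            placement, cpu_ways, cur = new_placement, new_ways, new
            if cur < best[0]:
                best = (cur, dict(placement), cpu_ways)
        temp = max(temp * alpha, 1e-6)  # geometric cooling schedule
    return best


if __name__ == "__main__":
    best_cost, best_placement, best_cpu_ways = anneal()
    print(best_cost, best_placement, best_cpu_ways)
```

The point of searching both decisions in one state is that they interact: a placement that separates CPU and FPGA arrays across banks changes which cache split is worth having, which is the cross-level coupling the abstract describes. A real cost model would be driven by measured or modeled miss counts rather than the made-up numbers used here.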
Keywords/Search Tags: Field Programmable Gate Array, Heterogeneous Multiprocessor System-on-Chip, memory management, performance optimization, shared resource conflict