| With the development of computer technology, embedded systems have been widely used in fields such as image processing and industrial control. Embedded systems based on heterogeneous computing can improve computing capability and strike a balance between performance and power consumption. A heterogeneous computing platform composed of a multi-core CPU, a mobile GPU, and an FPGA (Field Programmable Gate Array) is an important embedded hardware platform. As systems grow more complex, heterogeneous computing has become an important way to extend the application fields of embedded systems. However, embedded systems based on heterogeneous computing, especially FPGA-GPU-CPU platforms, have many types of resources and different programming models, so many challenging issues remain open for research. First, algorithms are needed for efficiently mapping computing tasks to hardware resources. At present, the core idea of heterogeneous computing development is to partition a complete computing task into small workloads so that the resulting subtasks can each be mapped to a heterogeneous accelerator. Research on partitioning falls into two categories: data partitioning and task partitioning. Both strategies have advantages and disadvantages. Data partitioning helps balance the workload among heterogeneous accelerators, while task partitioning adapts better to many application scenarios. Second, each kind of heterogeneous accelerator has its own design model, and the existing programming models of different platforms differ greatly. Although OpenCL promotes portability at the programming-language level, the OpenCL toolchain provided by each accelerator manufacturer is vendor-specific. In addition, because the mapping algorithm dynamically adjusts the granularity of task partitioning, the compilation space explodes. To solve these problems, this thesis explores the key technologies 
such as the abstraction layer and the implementation layer in the hierarchical model of a heterogeneous computing system, and constructs a design framework for heterogeneous computing systems with the goals of broadening the applicability of the platform, improving the efficiency of resource mapping and communication, and lowering the design and programming threshold of the platform. The main research contents and innovations of this thesis are as follows: 1. An abstract modeling method for heterogeneous computing systems. In view of the large differences among computing resources in a CPU-FPGA-GPU collaborative heterogeneous system, this thesis proposes a modeling method based on SEFM (Servant & Execution-flow Model), together with an extension of the model and methods for analyzing application features. A heterogeneous DNN acceleration case is developed using the model, in which three common networks obtain higher inference accuracy while using fewer FPGA resources. 2. An efficient communication and synchronization mechanism for servants. This thesis integrates the OpenCL runtime environments of various vendors into a complete development framework, which enables heterogeneous multi-accelerator collaboration in the cross-platform OpenCL language, and implements the execution flow and flow-lead-in mechanisms. Communication optimizations such as double buffering, compound servants, and multichannel memory are proposed to improve the throughput of the traditional shared-memory mechanism. 3. A servant mapping algorithm based on a uniform pipeline. The partitioning method combines the advantages of traditional data partitioning and task partitioning: it adopts a hybrid partitioning strategy to map multiple tasks to heterogeneous resources, which keeps the servant pipeline balanced and reduces resource occupation while the system achieves its performance goal. At the same time, the parameterized instance method can cooperate with the mapping algorithm to explore 
the hardware configuration suited to the target application. The design method shifts from traditional static compilation to dynamic online execution, which avoids the explosion of the compilation space. |
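
The pipeline-balancing idea behind the hybrid partitioning in item 3 can be illustrated with a minimal sketch. This is not the thesis's mapping algorithm. It is an idealized model that assumes each pipeline stage (task partitioning) can be replicated, that replicas split the stage's data evenly (data partitioning), and that communication cost is zero. The function names, stage names, and latency figures are hypothetical.

```python
import math


def balance_pipeline(stage_latency_ms, target_interval_ms):
    """Pick the smallest replica count per stage that meets the target interval.

    Idealized assumption: n replicas of a stage cut its per-item interval
    to latency / n, with no communication or synchronization overhead.
    """
    if target_interval_ms <= 0:
        raise ValueError("target_interval_ms must be positive")
    return {
        name: max(1, math.ceil(latency / target_interval_ms))
        for name, latency in stage_latency_ms.items()
    }


def effective_interval(stage_latency_ms, replicas):
    """The slowest stage (after replication) sets the pipeline's interval."""
    return max(
        latency / replicas[name] for name, latency in stage_latency_ms.items()
    )


if __name__ == "__main__":
    # Hypothetical per-item latencies of three pipeline stages, in milliseconds.
    stages = {"preprocess": 2.0, "conv": 9.0, "classify": 4.0}
    plan = balance_pipeline(stages, target_interval_ms=3.0)
    print(plan)
    print(effective_interval(stages, plan))
```

Under these assumptions, only the stages that miss the target are replicated, so the plan uses no more replicas than the performance goal requires. A real mapper would also have to account for transfer costs and per-device resource limits.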