
The Research On Hardware/Software Partitioning For Reconfigurable System

Posted on: 2014-11-02
Degree: Doctor
Type: Dissertation
Country: China
Candidate: X X Niu
Full Text: PDF
GTID: 1318330518471255
Subject: Computer application technology
Abstract/Summary:
FPGA-based reconfigurable computing systems (RCS) combine the performance of the FPGA (Field-Programmable Gate Array) with the flexibility of the GPP (General Purpose Processor) for an application, and are widely used in HPC (High Performance Computing). A good hardware/software partitioning algorithm can divide an application between GPPs and FPGAs automatically and effectively, and exploit the computing capabilities of both kinds of computing components. Hardware/software partitioning is therefore a hot topic in the RCS area. A review of related work in China and abroad shows that, although considerable results have been achieved in hardware/software partitioning, some problems remain. Building on existing research results, we define a framework for system-level hardware/software partitioning whose target architecture is an FPGA-based reconfigurable computing system, whose optimization objective is overall system performance, and whose constraint is the size of the FPGA. The proposed framework includes three functional modules. Within these modules, the estimation of the cost of running on the CPU or the FPGA and the partitioning algorithms are studied in depth in this dissertation. The goal of the framework is to choose, for each part of the application, between a software implementation and one of several hardware implementation versions. The main contributions of the dissertation are as follows.

The loop structure is usually the main time-consuming part of computationally intensive applications. Since FPGA-based reconfigurable computing systems emerged in recent years, static techniques for analyzing loop structures have not been able to support optimization that is specific to the current behavior of a program. To address the lack of direct access to run-time information when analyzing loops dynamically, a new loop-analysis method is proposed. In this method, which is implemented on LLVM, loop structures are recognized from the dominance relation; then, from the edge-profiling results, the number of times each loop is invoked, its average iteration count, and its running time are calculated. Experimental results show that the proposed method recognizes all loop structures and collects loop run-time information accurately, which supports the hardware/software partitioning of reconfigurable computing.

In the design of an RCS, estimating hardware delay and area is a fast and feasible alternative to obtaining them from a full implementation. However, existing high-level techniques for estimating hardware delay and area are tied to a specific hardware implementation environment (for example, the FPGA and the properties of the tools used) and generalize poorly. In addition, existing techniques cannot estimate multi-version implementations of loops because of limitations in compilation techniques. To address the poor generality, we introduce a feedback mechanism. First, from the operation expressions and the hardware circuits of the operations, we derive formulae that are independent of the implementation environment. Then we use feedback information to correct the formulae so that they fit the specific environment. To address the lack of multi-version estimation for loops, we propose a uniform, operation-based interface targeted at multi-version implementations and, combining it with the corrected formulae, design an estimation algorithm. The proposed method can estimate the hardware delay and area of different programs on an FPGA, and can obtain the hardware delay and area of multiple versions of a program, which supports hardware/software partitioning and hardware multi-version exploration.

Previous partitioning approaches assume that each application region has only a single hardware implementation. However, an application region can typically be implemented in many different versions. In addition, communication overhead is the performance bottleneck of FPGA-based reconfigurable computing systems. Based on these observations, we build a hardware/software partitioning model that supports multiple hardware versions, develop a clustering algorithm for loop structures based on communication overhead, and then revise the objective function of the partitioning model according to the clustering result. Finally, we use a genetic algorithm to solve the partitioning problem together with hardware multi-version exploration and the choice of clustering. With this algorithm, we can choose between the software implementation and the multiple hardware implementation versions, and thus improve partitioning quality from the viewpoint of overall optimization.

The experiments show that a genetic algorithm can solve the hardware/software partitioning problem with multi-version exploration and re-selection of the partitioning granularity. However, as the problem size grows, the genetic algorithm performs worse because of its weak local search ability, which depends on the mutation operator. Traditionally, the mutation operator follows a random policy and easily destroys good patterns in the chromosomes. This dissertation proposes an improved genetic algorithm in which, in accordance with the trade-off between hardware performance and area, the mutation operator is chosen adaptively based on the Q-learning algorithm and a greedy algorithm. The objective is to avoid blind mutation. The experiments show that the improved algorithm is very efficient compared with the standard genetic algorithm in terms of search quality and convergence; it improves local search capability and thereby further improves hardware/software partitioning quality.
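To make the multi-version partitioning model concrete, here is a minimal Python sketch under stated assumptions. It is not the dissertation's algorithm: the field names (`sw_time`, `comm_time`, `hw_versions`), the additive cost model, and the numbers are all hypothetical. It also swaps the genetic algorithm for plain exhaustive search, which is only practical for very small instances. Each region either runs in software (choice 0) or uses one of its hardware versions (choice 1, 2, ...), and the total hardware area must fit within the FPGA.

```python
from itertools import product


def evaluate(choice, regions, fpga_area):
    """Return the total time of a partition, or None if it exceeds the FPGA area.

    choice[i] == 0 means region i runs in software; choice[i] == k (k >= 1)
    selects the k-th hardware version of region i. (Hypothetical encoding.)
    """
    total_time = 0.0
    total_area = 0
    for version, region in zip(choice, regions):
        if version == 0:
            total_time += region["sw_time"]
        else:
            hw_time, hw_area = region["hw_versions"][version - 1]
            # Assumed cost model: hardware time plus a fixed communication cost.
            total_time += hw_time + region["comm_time"]
            total_area += hw_area
    return total_time if total_area <= fpga_area else None


def best_partition(regions, fpga_area):
    """Exhaustively search all software/hardware-version assignments."""
    options = [range(len(r["hw_versions"]) + 1) for r in regions]
    best_choice, best_time = None, float("inf")
    for choice in product(*options):
        t = evaluate(choice, regions, fpga_area)
        if t is not None and t < best_time:
            best_choice, best_time = choice, t
    return best_choice, best_time


# Illustrative data only: (time, area) pairs per hardware version.
regions = [
    {"sw_time": 10.0, "comm_time": 1.0, "hw_versions": [(4.0, 30), (2.0, 60)]},
    {"sw_time": 8.0, "comm_time": 0.5, "hw_versions": [(3.0, 50)]},
    {"sw_time": 2.0, "comm_time": 1.5, "hw_versions": [(1.0, 20)]},
]

if __name__ == "__main__":
    print(best_partition(regions, fpga_area=100))
```

In this toy instance the fastest version of the first region does not fit alongside the hardware version of the second, so the search settles on the smaller, slower version. The third region stays in software because its communication cost outweighs the hardware speed-up. A genetic algorithm would explore the same encoding (one integer per region) when the search space is too large to enumerate.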
Keywords/Search Tags: Reconfigurable computing system, FPGA, Software hardware partitioning, Performance estimation, Genetic algorithm, Q-learning algorithm