
Calibrative source-level multi-target performance estimation

Posted on: 2015-01-15  Degree: M.S.  Type: Thesis
University: Northeastern University  Candidate: Moazzemi, Kasra  Full Text: PDF
GTID: 2478390017490446  Subject: Engineering
Abstract/Summary:
Growing system complexity and shrinking time-to-market make designing a system increasingly challenging. To address these challenges, the design process is moving toward higher levels of abstraction. A systematic top-down approach from system-level specification down to implementation enables finding globally efficient solutions and accelerates the path to implementation. Designing at the system level requires methods that let designers explore the large design space rapidly, yet accurately enough to compare the many possible choices at this stage.

Design space exploration (DSE) is an essential aspect of system-level design. During DSE, many different design options (e.g., allocation of processors and mapping of behaviors to them) must be evaluated and compared with respect to performance, so the efficiency of the performance estimation method plays a vital role. Many approaches have been proposed to address the challenge of rapid early estimation. One approach is retargetable profiling [2], which offers fast estimation and efficient exploration integrated into the system design flow. This method relies on the performance cost of each Processing Element (PE), captured in a weight table. The cost is expressed as execution cycles for each operation type (e.g., addition, multiplication) and each data type, and must be entered manually by the database designer, typically based on the PE's datasheet. Next, the application whose performance is to be estimated is simulated and profiled once to obtain its computation demand. Finally, to obtain the estimated execution time for a particular PE, the computation demand is multiplied by the weight table and accumulated. This method delivers performance estimates very quickly, allowing rapid design space exploration. However, it has weak points: the cumbersome, labor-intensive definition of weight tables, low flexibility with respect to PE configuration, and poor correlation between estimated and real performance of processing elements.

This thesis introduces a framework for automatically determining the weight tables that capture processor characteristics. As a result, it simplifies retargetable profiling, increases estimation accuracy, and enables evaluating hardware and software configurations. In particular, the contributions of this work are a comprehensive framework for running benchmarks on different target platforms and extracting execution time metrics, and a linear programming formulation that identifies the weight table parameters by minimizing the overall error over the evaluated benchmarks.

Source-level estimation inherently cannot capture all aspects of the processor (such as hiding memory accesses through register reuse), and it is difficult to quantify how many characteristics are unknown or not observable. To gain more insight, we propose a synthetic performance model for both processor and benchmarks, which allows us to analyze the impact of unknown or non-observable features.

The efficiency of the proposed framework was evaluated with 85 benchmarks on 3 different processing platforms (Cortex-A9, Arm9, Bfin527). With weight tables gathered automatically through the proposed framework, retargetable source-level profiling is now better able to show the difference between software configurations (e.g., compiler optimizations) as well as hardware configurations (e.g., memory hierarchy).
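For concreteness, the two core steps described above, forming the estimate as computation demand multiplied by a weight table and calibrating that table by linear programming, can be sketched as follows. This is a minimal, hypothetical Python fragment: the operation classes, the use of scipy's linprog, and all numbers are assumptions for illustration, not the thesis's actual framework or data.

import numpy as np
from scipy.optimize import linprog

# Hypothetical operation classes; a real weight table is indexed by
# operation type and data type (e.g., int add, float multiply, ...).
OPS = ["int_add", "int_mul", "float_add", "float_mul", "load", "store"]

def estimate_cycles(op_counts, weights):
    # Retargetable-profiling style estimate: accumulate the profiled
    # computation demand multiplied by the per-operation cycle costs.
    return float(np.dot(op_counts, weights))

def fit_weight_table(demands, measured_cycles):
    # Fit one weight table from benchmark measurements by minimizing the
    # summed absolute estimation error with a linear program.
    #   demands:         (n_benchmarks, n_ops) profiled operation counts
    #   measured_cycles: (n_benchmarks,) cycles measured on the target PE
    n_bench, n_ops = demands.shape
    # Variables: [w_0 .. w_{n_ops-1}, e_0 .. e_{n_bench-1}]
    # minimize sum(e_i)  s.t.  |demands_i . w - measured_i| <= e_i,  w >= 0
    c = np.concatenate([np.zeros(n_ops), np.ones(n_bench)])
    A_ub = np.block([
        [ demands, -np.eye(n_bench)],   #  d.w - t <= e
        [-demands, -np.eye(n_bench)],   # -(d.w - t) <= e
    ])
    b_ub = np.concatenate([measured_cycles, -measured_cycles])
    res = linprog(c, A_ub=A_ub, b_ub=b_ub,
                  bounds=[(0, None)] * (n_ops + n_bench))
    return res.x[:n_ops]

# Toy usage with made-up numbers (not data from the thesis).
rng = np.random.default_rng(0)
demands = rng.integers(100, 10_000, size=(8, len(OPS))).astype(float)
true_weights = np.array([1.0, 3.0, 4.0, 5.0, 2.0, 2.0])
measured = demands @ true_weights
weights = fit_weight_table(demands, measured)
print("estimated cycles:", estimate_cycles(demands[0], weights))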
Relative comparison between processing elements is correct in 88-100% of estimations. Each retargeted estimation completes in under 10 milliseconds, regardless of the size of the benchmark or the complexity of the processor. Compared with previously hand-filled weight tables, our approach reduces the estimation error from 73% to 36%. Our simulation model shows that, if the majority of the functional characteristics are non-observable or unknown, accuracy cannot be quantified even with an infinite number of benchmarks.
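To illustrate the kind of effect the synthetic model is used to study, the toy sketch below shows how a residual error persists when part of the cost comes from features the source-level profile cannot observe, no matter how many benchmarks are fitted. It is purely illustrative: the observable/hidden feature split, the least-squares fit, and all numbers are assumptions, not the model or results from the thesis.

import numpy as np

rng = np.random.default_rng(1)
n_obs, n_hidden = 6, 6            # observable vs. non-observable features
true_w = rng.uniform(1.0, 5.0, n_obs + n_hidden)

def synth_benchmarks(n):
    # Synthetic benchmarks: random feature counts plus the "true" cycle
    # cost contributed by all features, observable or not.
    counts = rng.integers(100, 10_000, size=(n, n_obs + n_hidden)).astype(float)
    cycles = counts @ true_w
    return counts[:, :n_obs], cycles   # the estimator only sees observable counts

for n_bench in (10, 100, 1000, 10000):
    seen, cycles = synth_benchmarks(n_bench)
    w_fit, *_ = np.linalg.lstsq(seen, cycles, rcond=None)
    err = np.abs(seen @ w_fit - cycles) / cycles
    print(f"{n_bench:>6} benchmarks -> mean relative error {err.mean():.1%}")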
Keywords/Search Tags: Estimation, Performance, Benchmarks, Weight tables