| As one of the most important tools in architecture research, simulation has been widely used to evaluate new architecture designs. Compared with hardware implementation, simulation is more flexible and less costly, but it runs far more slowly, which significantly limits its effectiveness and feasibility. Simulation acceleration has therefore long been a hot topic in architecture research. Sampling-based simulation is one of the most effective acceleration techniques because it is easy to implement and yields notable speedups. Its basic idea is to simulate only selected samples in detail and to estimate the overall metrics from the sampled results. Prior research has mainly focused on designing effective measures to characterize the behavior of samples and on using these measures to pick high-quality samples. However, few studies have addressed another critical design issue: sample granularity. Most researchers intuitively prefer a fine-grained design, which directly reduces the size of individual samples. It remains unknown whether a better sampling scheme can be designed by choosing a coarse-grained design, or whether current measures can be applied to characterize coarse-grained samples. After a comprehensive study of sampling-based simulation processes, this paper finds that the coarse-grained design is actually better than the fine-grained design. The advantages of the coarse-grained design are as follows: (1) Suitable coarse-grained sampling can effectively filter out unnecessary noise while preserving essential characteristics, so that characteristic analysis and sample selection are significantly improved. (2) It can effectively reduce the functional simulation time, which has become the dominant portion of the total time in fine-grained sampling-based simulation. (3) 
Hierarchical sampling can be applied to the selected coarse-grained samples to further reduce simulation time. By selecting finer-grained simulation points within coarser-grained phases, we can gain the advantages of both phase granularities. Based on these observations, this paper designs and implements a coarse-grained sampling scheme that partitions the execution flow into coarse-grained samples and combines instruction-flow measures with newly introduced data-flow measures to select high-quality samples. Experimental results show that our approach effectively reduces simulation time while achieving accuracy comparable to that of mainstream sampling schemes. In comparative experiments on SPEC2000 against the SimPoint approach with 10M fixed-length intervals, it achieves an average speedup of 4.11X, which increases to 8.17X when hierarchical sampling is used. |
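
The basic idea stated in the abstract, estimating an overall metric from a few samples simulated in detail, can be illustrated with a minimal sketch. It assumes that each selected sample carries a weight (the fraction of the whole execution it is taken to represent) and a CPI value measured by detailed simulation. The names `Sample`, `estimate_overall_cpi`, and `relative_error`, and all numbers in the usage example, are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Sample:
    """One sample selected for detailed simulation (hypothetical structure)."""
    weight: float  # assumed fraction of the full execution this sample represents
    cpi: float     # cycles per instruction measured in detailed simulation


def estimate_overall_cpi(samples: Sequence[Sample]) -> float:
    """Weighted average of per-sample CPI, normalized by the total weight."""
    total_weight = sum(s.weight for s in samples)
    if total_weight <= 0:
        raise ValueError("samples must have a positive total weight")
    return sum(s.weight * s.cpi for s in samples) / total_weight


def relative_error(estimate: float, reference: float) -> float:
    """Relative error of the sampled estimate against a full-simulation value."""
    return abs(estimate - reference) / abs(reference)


if __name__ == "__main__":
    # Illustrative values only; they are not measurements from the paper.
    chosen = [Sample(0.5, 1.0), Sample(0.3, 2.0), Sample(0.2, 1.5)]
    estimate = estimate_overall_cpi(chosen)
    print(f"estimated CPI: {estimate:.3f}")
    print(f"relative error vs. assumed reference 1.45: "
          f"{relative_error(estimate, 1.45):.3%}")
```

With these illustrative inputs the estimate is roughly 1.4. In a sketch like this, a coarse-grained or hierarchical scheme changes only how the samples and their weights are chosen; the estimation step stays the same.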