| As embedded applications grow more demanding, MPSoC architectures are becoming more complex and designs larger, which calls for fast and accurate performance estimation techniques. Starting from the Simulink-based MPSoC design flow, this dissertation explores the related research fields and proposes a complete MPSoC performance estimation flow. Experiments on video codecs and benchmarks verify and evaluate the proposed flow. The research covers the following four aspects: 1) A combined profiling and annotation method for performance estimation of multimedia application-specific MPSoCs. Current techniques are either time-consuming or lack accuracy. This dissertation addresses these problems with a hybrid method for multimedia MPSoC performance estimation. The general-purpose coverage analysis tool GNU gcov is used to collect execution statistics during native simulation. To tackle complexity and keep analysis and simulation manageable, the communication and computation parts are orthogonalized. The estimate for the computation part is annotated onto a transaction-accurate model for further analysis, which supports gradual refinement of the MPSoC performance estimate. An implementation and experimental results for QVGA Motion-JPEG and MPEG-2 demonstrate the feasibility and efficiency of the proposed method. 2) Cache modeling for MPSoC performance estimation. To obtain sufficiently accurate performance estimates, the influence of all system components must be taken into account. The cache is the most important of these and has a great impact on the performance of the whole system. The disadvantages of existing cache modeling techniques for MPSoC performance estimation are analyzed, and a cache modeling technique for native simulation that combines static analysis with dynamic annotation is proposed. 
It employs GCC profiling, avoids tag search for hit/miss decisions, coarsens the granularity of cache updates, and builds an accurate address mapping table for instructions and all types of data variables, which improves both simulation speed and estimation accuracy. Multi-level cache modeling is also considered, which extends support to multi-processor platforms. Experimental results show that, compared with existing techniques, the proposed technique significantly reduces simulation time and improves the accuracy of the estimate. 3) Global shared memory access modeling for MPSoC performance estimation. Integrating more processor components puts growing pressure on the global shared memory, which has become the performance bottleneck for large applications. To combine the advantages of dynamic simulation and static analysis, this dissertation collects execution statistics with the GCC profiling tool during native simulation and presents an equalized-access global memory access model that imitates contention during SystemC cycle-accurate simulation to obtain the delay that contention causes, yielding more accurate performance estimates. 4) Performance estimation techniques with MPSoC transaction-accurate models. To combine the proposed techniques, this dissertation proposes a performance estimation flow at the MPSoC transaction-accurate level, making the proposed techniques more widely applicable. These techniques have been applied to an H.264 decoder application on different hardware architectures. The experimental results show that applying these techniques brings the estimation accuracy of transaction-accurate models close to that of virtual prototype models, with a tolerable overhead in simulation speed. |
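
The profiling and annotation idea in aspect 1 can be illustrated with a minimal sketch. It assumes that the computation estimate for a task is the sum, over its profiled basic blocks, of the execution count multiplied by a per-execution cycle cost. The abstract does not state the actual cost model, so the formula, the names `BlockProfile`, `computation_cycles`, and `annotate`, and all numbers below are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BlockProfile:
    """One profiled basic block (hypothetical structure)."""
    name: str
    exec_count: int       # times the block ran in native simulation, e.g. read from a gcov report
    cycles_per_exec: int  # assumed target-processor cost of one execution


def computation_cycles(profile):
    """Sum execution count times per-execution cost over all profiled blocks."""
    return sum(b.exec_count * b.cycles_per_exec for b in profile)


def annotate(task_profiles):
    """Map each task name to the cycle count that would be annotated onto its model."""
    return {task: computation_cycles(blocks) for task, blocks in task_profiles.items()}


# Illustrative numbers only; they are not measurements from the dissertation.
profiles = {
    "idct": [BlockProfile("loop_body", 6400, 12), BlockProfile("setup", 100, 5)],
    "vld": [BlockProfile("decode_symbol", 2000, 9)],
}
annotations = annotate(profiles)
print(annotations)
```

In a flow of this kind, the per-task totals would be attached to the computation part of the transaction-accurate model, while communication delays would still come from simulation, which reflects the separation of computation and communication described above.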