
Research On Instruction Dynamic Mapping Algorithm Of EDGE Architecture

Posted on: 2013-07-21
Degree: Master
Type: Thesis
Country: China
Candidate: J Gao
Full Text: PDF
GTID: 2268330392968735
Subject: Microelectronics and Solid State Electronics
Abstract/Summary:
The monolithic structures commonly used in out-of-order superscalar processors have severely limited further improvement in microprocessor performance. EDGE, one of the models proposed to overcome this bottleneck, abandons the monolithic, power-hungry, and unscalable structures in its architecture model. In a distributed EDGE architecture, instructions are mapped onto several tiles and execute there concurrently. Operand communication among tiles incurs delay and degrades performance. The instruction mapping algorithm tries to mitigate this loss by carefully balancing the communication delay among tiles against the degree of parallelism.

In the TRIPS microprocessor, critical resources are placed asymmetrically in the topology and a static instruction mapping algorithm is used. This leads to unbalanced load on the execution tiles (ETs) and hot spots in the operand network (OPN), both of which degrade performance.

In this thesis, a TRIPS-like EDGE architecture is implemented in the M5-EDGE simulator to study the dynamic Deep instruction mapping algorithm. Results show that Deep mapping with round-robin ET selection, without any compiler scheduling or optimization, reaches 85% and 98.3% of the performance of SPDI (static placement, dynamic issue) at issue widths 1 and 2, respectively. Three optimizations take the topological location of the register tiles (RTs) and data tiles (DTs) into account: choosing ETs in numbering order, choosing ETs in zigzag order, and choosing a tile by calculating the global communication hops of a hyperblock. Compared with the base Deep mapping algorithm at issue width 1, they reduce average hops by 2.63%, 2.18%, and 4.70%, and improve IPC by 1.07%, 1.21%, and 2.11%, respectively. Optimizations that reduce the communication hops of instructions can therefore improve IPC notably.

Over 90% of operands are delivered through the local bypass path in Deep mapping, which greatly reduces the load on the OPN. Simulation shows that when the bypass width is twice the issue width, the delay of local operand bypass is nearly 0. Increasing the width of the local bypass path is thus an effective way to reduce operand delivery latency.

When RTs are placed into ETs according to register number, the IPC of the base Deep mapping algorithm improves by 1.77%. Two further optimizations take DT location into account: preferentially selecting ETs close to DTs, and calculating a hyperblock's communication hops to select a suitable ET. These improve IPC by 1.17% and 1.89% over the base Deep algorithm. When both RTs and DTs are distributed into ETs, a 4x4 grid topology is obtained. On this topology, the IPC of the Deep mapping algorithm is 97.18% and 113.42% of SPDI at issue widths 1 and 2; with a simple optimization on top of Deep, the figures are 97.32% and 114.06%. IPC improves notably when topology hops decrease, whether through micro-architecture changes or through optimizations of the Deep mapping algorithm.
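The abstract does not spell out how tiles are chosen, so the following is only a minimal sketch of the two selection ideas it names: round-robin selection and selection by total communication hops. It assumes a 4x4 mesh in which the hop count between two tiles is their Manhattan distance, and it represents the tiles a hyperblock must communicate with as a list of `(row, col)` endpoints. All names (`hops`, `round_robin_mapper`, `min_hops_tile`, `total_hops`) are hypothetical and are not taken from the thesis or from M5-EDGE.

```python
from itertools import cycle

GRID = 4  # assumed 4x4 grid of tiles
TILES = [(r, c) for r in range(GRID) for c in range(GRID)]


def hops(a, b):
    """Manhattan distance between two (row, col) tiles on a mesh (assumed metric)."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def total_hops(tile, endpoints):
    """Sum of hops from one candidate tile to every endpoint it must talk to."""
    return sum(hops(tile, e) for e in endpoints)


def round_robin_mapper(tiles=TILES):
    """Baseline: hand out tiles in a fixed cyclic order, ignoring location."""
    order = cycle(tiles)
    return lambda endpoints: next(order)


def min_hops_tile(endpoints, tiles=TILES):
    """Hop-aware choice: the tile with the smallest total hop count."""
    return min(tiles, key=lambda t: total_hops(t, endpoints))


if __name__ == "__main__":
    # Hypothetical endpoints, e.g. tiles holding the registers and data
    # that one hyperblock reads or writes.
    endpoints = [(3, 3), (3, 1), (2, 2)]
    rr = round_robin_mapper()
    first = rr(endpoints)
    best = min_hops_tile(endpoints)
    print("round-robin:", first, total_hops(first, endpoints))
    print("hop-aware:  ", best, total_hops(best, endpoints))
```

The sketch ignores load balance, issue width, and bypass paths, all of which the abstract reports as affecting IPC; it only illustrates why a location-aware choice can lower the hop count relative to a location-blind one.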
Keywords/Search Tags: EDGE architecture, dynamic instruction mapping, performance analysis