Architectural enhancements for efficient operand transport in multimedia systems

Posted on:2008-12-29

Degree:Ph.D

Type:Dissertation

University:Georgia Institute of Technology

Candidate:Kim, Hongkyu

GTID:1448390005970773

Subject:Engineering

Abstract/Summary:

Multimedia applications pose new challenges to computer architecture. Their tremendous communication demands severely burden the interconnect between functional units, which has become a bottleneck in high-performance architectures. This dissertation addresses a critical challenge in multimedia processors: efficiently transporting operands among computational and storage components. It provides architectural enhancements that enable the high-bandwidth, low-latency communication demanded by multimedia applications.

This research analyzes multimedia workloads to characterize the communication patterns that occur in the execution of standard multimedia benchmarks. The analysis indicates that most operands exhibit strong locality, enabling several optimizations of transport mechanisms, particularly in operand transport networks, storage structures, and instruction steering algorithms. It shows that an eight-entry local buffer with approximate information on operand lifetime is sufficient to suppress 81% of operand writes. In addition, chaining selected pairs of functional units (FUs) based on producer-consumer information allows 50% of reads to be accessed through the shortest path.

These results guide the design and development of two efficient operand transport mechanisms: (i) a traffic-driven operand bypass network and (ii) dynamic instruction clustering. The traffic-driven operand bypass network is designed using a novel, systematic design customization process for wide-issue architectures. It is driven by a technology model-based evaluation methodology on different execution engines, resulting in a low-cost, high-performance bypass network targeted at multimedia applications. The technique places microarchitectural components to exploit the transport communication patterns, reorganizes each bypass path based on its traffic rate, and maps inter-instruction communication onto the local paths. The reduction in operand transport latency, combined with a faster clock cycle, achieves an instruction throughput gain of 2.9x over the broadcast bypass network at 45 nm. In addition, the instruction throughput gain over a typical clustered architecture is 1.3x.

Dynamic instruction clustering groups dependent instructions into clusters during instruction execution, detects operand lifetimes, performs intra- and inter-cluster operand transport pattern analysis, and maps the clustered instructions to an efficient cluster execution unit. Two cluster execution unit implementations are explored: network ALUs and a dynamically scheduled SIMD PE array. In the network ALUs, intermediate values within inner loops are propagated among ALUs without distribution through global bypass buses. The reduction in operand transport latency results in a 35% IPC speedup over a conventional ILP processor. The dynamically scheduled SIMD PE array supports DLP processing of the innermost loops in image processing applications. Data-parallel operations combined with localized operand communication produce an IPC speedup of 2.59x over a 16-way, four-cluster microarchitecture.
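
The grouping of dependent instructions described above can be illustrated with a small sketch. This is not the dissertation's algorithm: the `Instr` record, the greedy placement rule, and the `max_cluster_size` parameter are all assumptions made for illustration. It only shows how placing a consumer next to its producer turns operand reads into intra-cluster traffic.

```python
from collections import namedtuple

# Hypothetical instruction record: one destination register, a list of sources.
Instr = namedtuple("Instr", ["dest", "srcs"])

def cluster_by_dependence(instrs, max_cluster_size=4):
    """Greedily place each instruction in the cluster of one of its producers.

    Returns (assignment, intra, inter): the cluster id chosen for each
    instruction, and the number of operand reads served within a cluster
    versus across clusters. The policy is an illustrative assumption.
    """
    producer_of = {}   # register name -> index of the latest instruction writing it
    assignment = []    # instruction index -> cluster id
    sizes = []         # cluster id -> number of instructions placed so far
    intra = inter = 0

    for idx, ins in enumerate(instrs):
        # Clusters that hold the producers of this instruction's source operands.
        producer_clusters = [assignment[producer_of[r]]
                             for r in ins.srcs if r in producer_of]
        # Prefer the first producer cluster that still has room.
        target = next((c for c in producer_clusters
                       if sizes[c] < max_cluster_size), None)
        if target is None:
            # No producer cluster available: open a new cluster.
            target = len(sizes)
            sizes.append(0)
        sizes[target] += 1
        assignment.append(target)

        # Classify each operand read as local or cross-cluster.
        for c in producer_clusters:
            if c == target:
                intra += 1
            else:
                inter += 1
        producer_of[ins.dest] = idx

    return assignment, intra, inter

# A dependent chain r1 -> r2 -> r3, an independent r4, and a join of r4 and r3.
trace = [
    Instr("r1", []),
    Instr("r2", ["r1"]),
    Instr("r3", ["r2"]),
    Instr("r4", []),
    Instr("r5", ["r4", "r3"]),
]
assignment, intra, inter = cluster_by_dependence(trace)
```

On this toy trace the chain stays in one cluster and only the final join reads one operand across clusters. A real steering mechanism would also weigh operand lifetime and cluster load, which this sketch ignores.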

Keywords/Search Tags:

Operand, Multimedia, Communication, Applications, Bypass network, Efficient, Over

Related items

1	The Design Of Floating-Point Multiply-Add Fused Units In General Purpose Processors
2	Design And Implementation Of Efficient Wireless CNC Network Audit System Based On Kernel Bypass
3	Research On Multimedia Intelligent Terminal Platform For Big Data Applications
4	An integrated communication architecture for distributed multimedia applications
5	Communication-efficient multiparty oblivious transfer and applications
6	The Research And Design Of Bypass Network Equipment Protection System
7	Research For Multimedia Multicasting Mechanism On Partial-multicast Network
8	Efficient MAC protocols for streaming data transmission over wireless networks
9	Multimedia Communications Encryption System And Implementation Of The H.323 Network
10	Multimedia network synchronization in real-time applications