Font Size: a A A

Architectural and compiler optimization for network processors

Posted on:2008-02-26Degree:Ph.DType:Dissertation
University:University of California, RiversideCandidate:Yu, JiaFull Text:PDF
GTID:1448390005977406Subject:Computer Science
Abstract/Summary:
In today's capital constrained environment, routers now must support continually evolving requirements on aggregating a range of network protocols and traffic types. To meet such requirement, network processors (NPs) have emerged as a new class of programmable processors for network applications. New generation NPs offer high performance through parallel processing architecture, which incorporates multiple processing elements (PEs) configured as either independent or pipelined units. Being programmable, NPs support new applications with improved time to market at lower cost.; Performance and power consumption are two primary concerns for NPs. However, to conduct research for high performance and low power, we should find an efficient processor simulation method. The existing cycle-accurate simulators faithfully model the micro archi tecture, but they take extremely long time to simulate. The slow speed makes cycle accurate simulation impractical for architectural evaluation. We thus propose a statistical input sampling algorithm that extracts representative traffic patterns from traffic traces. With shortened traffic trace, the simulation can finish in a short time with sufficient accuracy.; With the proposed fast simulation method, we further exploit the power saving opportunities in NPs. Traffic variation is typically observed on routers. This traffic variation implies that much less processing power is required during light workload. A number of low power techniques, Dynamic Voltage Scaling (DVS), Clock Gating and Power Gating are studied in NP system. These techniques are shown to be very effective in reducing power in NP system.; To increase throughput of NPs, compiler optimization plays a critical role in finding a good program mapping. We propose to use a divide-and-conquer approach to recursively bipartition the task graph and allocate processing elements (PEs). The recursive min-cut algorithm can find the optimal number of pipeline stages with minimum communication overhead. Then we propose a hierarchical refinement method to migrate tasks from bottleneck stage to non-bottleneck stages. The experiment shows that our algorithm achieves averagely 18% throughput improvement than previous compiler algorithms.
Keywords/Search Tags:Network, Compiler
Related items