YHFT-DX is a high-performance quad-core DSP (Digital Signal Processor) chip designed in a 65 nm process. It is required to work at 500 MHz under worst-case conditions. Because the SM (shared memory) is the center of the storage path in the chip, it has a significant effect on the design. This paper focuses on the optimization techniques used in the physical design of the YHFT-DX SM. The main contributions are divided into the following parts:

1) In the placement and routing optimization, a hierarchical physical design method based on the SM architecture is used to optimize the bottom level and the top level separately. In the placement optimization of the bottom-level bank module, the critical path delay is shortened by manual placement of registers, which also reduces the complexity of the clock tree structure, resulting in a 40.3% improvement in terms of wire routing. In the placement optimization of the top level, the routing channels are sized in proportion to the number of traces, so as to increase routing resources and resolve routing congestion. In the design of the power and ground distribution network, regions are divided according to their demand for routing resources, with different densities of power and ground stripes in the horizontal and vertical directions and of via arrays between the metal wires. While routing resources are saved, the IR drop is still kept below 2.5%.

2) In the design of the clock tree, according to the characteristics of the SM timing paths and the distribution of the SRAMs, and considering the data path delay of the various types of timing paths, positive clock skew is used to resolve setup time violations and negative clock skew is used to resolve hold time violations. The maximum positive clock skew reaches 0.294 ns and the maximum negative clock skew is 0.195 ns. The use of clock skew plays a key role and is a major feature of the physical design of the SM.
In the implementation of the clock tree, the clock subtrees are constructed by combining manual and automatic design. A variety of routing methods are used to eliminate crosstalk on the clock lines, which ensures the quality of the clock signal.

3) In the data path optimization, we focus mainly on long interconnect optimization and bus architecture optimization. In long interconnect optimization, a cross-improved inverter chain is inserted, which improves the critical path delay of the data path and the line crosstalk delay by 0.177 ns. In bus architecture optimization, a three-state bus structure is used; its delay is 0.049 ns lower than that of the mux selection bus structure.

In all, for one critical path, 0.294 ns of clock skew is utilized on the clock path and 0.177 ns of data path delay is removed by the optimization of long interconnect lines, so that 0.471 ns is optimized over the full path. The design goal of the chip working at 500 MHz under worst-case conditions in a 65 nm process is reached, and the chip has already taped out.
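The timing figures in the closing paragraph can be checked with a few lines of arithmetic. The following is a minimal sketch for illustration only: the function and variable names are hypothetical, and only the 500 MHz target, the 0.294 ns clock skew, and the 0.177 ns data path saving are taken from the text.

```python
# Figures quoted in the text; all names below are hypothetical.
TARGET_FREQ_MHZ = 500.0        # required worst-case operating frequency
CLOCK_SKEW_NS = 0.294          # positive clock skew used on the critical path
DATA_PATH_SAVING_NS = 0.177    # saving from long interconnect optimization


def clock_period_ns(freq_mhz: float) -> float:
    """Convert a clock frequency in MHz to a period in ns."""
    return 1000.0 / freq_mhz


def total_gain_ns(skew_ns: float, data_saving_ns: float) -> float:
    """Sum the clock path and data path contributions for one path."""
    return skew_ns + data_saving_ns


period = clock_period_ns(TARGET_FREQ_MHZ)
gain = total_gain_ns(CLOCK_SKEW_NS, DATA_PATH_SAVING_NS)

print(f"clock period: {period:.3f} ns")
print(f"total gain:   {gain:.3f} ns")
print(f"share of period: {100.0 * gain / period:.1f} %")
```

With these inputs, the combined gain of about 0.471 ns is a little under a quarter of the 2 ns clock period at 500 MHz.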