Research On New High-radix Interconnect Chip Architecture And Key Technologies For Large-scale High-performance Computing

Posted on: 2019-04-30
Degree: Doctor
Type: Dissertation
Country: China
Candidate: S Xu
Full Text: PDF
GTID: 1368330545473658
Subject: Computer Science and Technology
Abstract/Summary:
Building an Exascale supercomputing system with current high-speed interconnect technology will face many challenges, such as unacceptable system power consumption, difficulty in designing the network topology, significantly increased transmission delays, unacceptable system reliability, and the increasing engineering density of the interconnection network. It is therefore necessary to explore novel interconnect technologies, including higher-radix routers, converged interconnection architectures, and new optoelectronic interconnection and switching technologies, in order to improve the performance of the interconnection network, reduce the power consumption of the interconnection system, and improve the reliability and scalability of the entire system. This thesis analyzes the state of the art of high-performance interconnection networks and related key technologies, and focuses on four aspects: high-radix interconnect routing switch chip architecture and key fault-tolerant technologies, high-speed communication interface optimization technology for memory interconnect network architectures, next-generation 100 Gbps optical serial interface transceiver technology, and scalable optical switching technology. The goal is to achieve key technological breakthroughs, ease the communication wall of Exascale systems, and effectively support Exascale applications. The main contributions and innovations are as follows:

(1) High-radix interconnect routing switch chip architecture and key fault-tolerant technologies

To address the problems faced by high-radix router architectures, including high hardware complexity, weak scalability, limited buffer resources, and poor system robustness, an Aggregated-tile Based Router Architecture (ATR) is proposed, together with a tile performance analysis and optimization method based on the M/D/1 queuing theory model. The architecture reduces the storage overhead and global bus overhead of a radix-64 routing switch chip by 40% to 50%, while achieving about 98% of the saturation throughput and switching delay performance of the YARC structure. Based on the aggregated switching fabric, a fair wavefront arbitration scheduling algorithm for high-radix switching is designed for crossbar scheduling. It achieves fast timing, high throughput, and fair arbitration with less overhead. Compared with the traditional DRRM algorithm, it reduces the average scheduling time and the average response time of packets by about 15% and 21%, respectively. In terms of protocols and flow control, distributed hierarchical routing and dynamic multi-queue flow control mechanisms are proposed to mitigate the limited router buffering and input buffering resources and to ensure that buffers are allocated on demand. In terms of fault tolerance, an intelligent network management engine is designed, and an intelligent algorithm for fault detection and fault recovery is proposed so that network stability is maintained automatically under fault scenarios. Compared with the re-coil routing policy and the U-turn routing policy, it achieves better network performance.

(2) Key technologies for optimizing high-speed communication interconnect interfaces for memory interconnect networks

To meet the demands of high-performance computing, big data, cloud computing, cognitive computing, and similar workloads, a converged interconnect network architecture for Exascale high-performance computers is studied, which provides data access with low latency and reliable, balanced bandwidth. Focusing on the memory network architecture and the high-speed communication interconnect interface technology, a memory network architecture adapted to big data processing is first proposed for domestic multi-core processors. The tightly coupled design of memory and interconnect eliminates the need for PCI-E interfaces and effectively reduces data transfer overhead, while also providing large shared-memory capacity for big data processing systems. Secondly, the structure and optimization techniques of the high-speed communication interconnect interface in the memory network storage controller are proposed, including a simplified link layer protocol, a multi-group parallel bus channel technique that combines serial and source-synchronous signaling, "read command priority" and "inferred write" command scheduling, a low-delay skew structure for the multi-channel parallel bus, and virtual active page buffer optimization. The optimized high-speed communication interconnect interface can match the bandwidth of two DDR memory channels. Tests with synthetic and real workloads on the domestic processor platform show that the maximum effective bandwidth of the interconnect interface is 14 GB/s, the total memory access bandwidth under the 64-thread STREAM test is 96.99 GB/s, and the memory access delay is only about 150 ns. The virtual active page buffer structure increases the bandwidth of the 64-thread STREAM OpenMP program by 16.86% and the execution speed of the NPB-MPI program by 6%.

(3) 100 Gbps optical serial interface transceiver technology and scalable optical switching technology

For high-speed interconnect chips, the current 50 Gbps serial interface is limited by many constraints, including power density, resource area, and signal integrity. This thesis therefore studies 100 Gbps optical serial interface transceiver technology. Building on recently introduced low-insertion-loss silicon photonic switches, an Optical Time Division Multiplexing (OTDM) scheme is proposed. OTDM achieves 100 GBaud transmission by using cascaded high-speed optical switches to perform time-division multiplexing and demultiplexing in the optical path, multiplexing multiple low-bitrate bitstreams onto a single high-baud-rate optical link. By introducing a dark modulation mode to unify the signal amplitude on the transmission link and to solve the problem of crosstalk between clock cycles, the scheme is further optimized and the transmission symbol rate is raised to 125 GBaud. Finally, a high-performance interconnection network architecture based on arrayed waveguide grating routers (AWGRs) is proposed. Wavelength division and wavelength multiplexing are used to construct a nested 2D tree topology, and the wavelength-routing property of AWGRs is used to reduce the number of wavelengths the system requires, so that a system of 262,144 nodes can be constructed using 8 wavelengths. In a 100,000-node system, the numbers of fibers and switches required by the AWGR interconnection network are only 50% and 35% of those of a fat tree, and the total power consumption is only about 40% of that of a fat tree.
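
As a rough illustration of the M/D/1 queuing model cited under contribution (1), the sketch below evaluates the standard closed-form mean queue length and waiting time for a single queue with Poisson arrivals and a fixed service time. It is not the thesis's tile analysis method; the function name `md1_metrics`, its parameters, and the example load values are assumptions chosen for illustration.

```python
def md1_metrics(arrival_rate: float, service_rate: float) -> dict:
    """Mean-value results for an M/D/1 queue.

    arrival_rate: mean Poisson arrival rate (e.g. packets per cycle).
    service_rate: deterministic service rate (e.g. packets per cycle).
    Both names and units are illustrative assumptions.
    """
    if arrival_rate < 0 or service_rate <= 0:
        raise ValueError("arrival_rate must be >= 0 and service_rate > 0")

    rho = arrival_rate / service_rate  # utilization (offered load)
    if rho >= 1.0:
        raise ValueError("queue is unstable when utilization >= 1")

    # Pollaczek-Khinchine results specialized to deterministic service.
    mean_queue_length = rho ** 2 / (2.0 * (1.0 - rho))    # waiting packets, Lq
    mean_wait = rho / (2.0 * service_rate * (1.0 - rho))  # time in queue, Wq
    mean_sojourn = mean_wait + 1.0 / service_rate         # queue + service time

    return {
        "utilization": rho,
        "mean_queue_length": mean_queue_length,
        "mean_wait": mean_wait,
        "mean_sojourn": mean_sojourn,
    }


if __name__ == "__main__":
    # Hypothetical loads for a port that serves one packet per cycle.
    for load in (0.3, 0.5, 0.8, 0.95):
        m = md1_metrics(arrival_rate=load, service_rate=1.0)
        print(f"load={load:.2f}  Lq={m['mean_queue_length']:.3f}  "
              f"Wq={m['mean_wait']:.3f}")
```

Under this model the mean wait grows sharply as the load approaches saturation, which is the kind of relationship a queuing-based analysis can use when trading buffer depth against throughput.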
Keywords/Search Tags: high-radix switching chip, converged interconnection network, optical serial interface transceiver, optical switching