| As the key infrastructure that carries data center computing, storage, and transmission, data center networking is undergoing profound changes in scale, composition, structure, function, control, and application. Studying resource management and control theories and methods for data center networks, revealing their operating principles, improving their operating efficiency, and reducing network energy consumption and construction costs are important topics in data center network management and control, and are of considerable theoretical and practical significance. This paper studies the data plane of software-defined data center networking from three perspectives: resource management and control methods, link resource management and control, and buffer resource management and control. The main research results and innovations are as follows. (1) For resource awareness, a mechanism for in-band network telemetry (INT) packet loss detection, packet loss localization, and missing telemetry data recovery is proposed, based on alternate-marking performance measurement. First, the reliability of the existing INT reference protocol is analyzed, two coding schemes (single-bit alternate marking and multi-bit cyclic marking) are proposed, and packet loss detection and localization algorithms are designed for both single-path forwarding and multi-path load balancing. Second, a packet loss root cause classification algorithm is proposed; by analyzing the features of several packet loss types (congestion, black-hole, and random loss) together with the alternate-marking loss results, it provides network state labels for missing data recovery. Then, a missing telemetry data recovery mechanism based on generative adversarial networks is designed, which mines the characteristics of the different packet loss modes so that the imputed values are closer to the original values. Finally, the first public INT packet loss dataset is released as open source. Experimental results show that the proposed mechanism achieves high detection accuracy, diagnostic accuracy, and recovery performance on this dataset. (2) For link resource management and control, a data center network traffic scheduling mechanism based on preference ordering and bilateral (two-sided) matching is proposed. First, flows and paths are treated as the two sides of a bilateral matching in the data center network, and a mathematical model of the path-flow matching problem that takes preference order into account is given. Second, a bilateral matching algorithm that considers both the stability and the optimality of the matching result is proposed. Finally, the mechanism is implemented on the Floodlight controller. Experimental results show that, compared with typical traffic scheduling schemes such as ECMP, Hedera, and Fincher, the proposed mechanism improves bandwidth utilization by 1.53%-109.09% and shortens the completion time of short-flow tasks by 1.56%-34.91%. (3) For buffer resource management and control, a centralized automatic tuning mechanism for ECN/PFC watermarks in RoCEv2 networks is proposed, targeting existing deployed switching equipment. First, three typical cases of improper watermark configuration are analyzed, together with the results of traversing the ECN/PFC watermark configuration space under three business scenarios, to demonstrate the feasibility of watermark optimization. Second, the DCQCN fluid model is improved and the impact of ECN/PFC configuration on network performance is analyzed. Finally, an automatic watermark optimization mechanism based on the simulated annealing algorithm is designed; it uses (in-band) network telemetry to collect switch throughput and queuing delay information while optimizing telemetry overhead. Experimental results show that, compared with the empirical watermark scheme, the watermark configuration recommended by the equipment manufacturer, and the DCTCP and DCQCN reference watermark configurations, the proposed mechanism significantly improves performance in standard test scenarios, Redis storage scenarios, and Multihost scenarios. The proposed mechanism also avoids interference between switches and offers high search efficiency and low deployment cost. |
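
To make the single-bit alternate-marking idea in (1) concrete, the following is a minimal sketch, not the paper's algorithm. It assumes a single forwarding path, a marking bit that flips every `block_size` packets, and per-block packet counters at every node; comparing the counters of adjacent nodes then gives both the loss count and the hop where the loss occurred. All names (`mark_packets`, `forward`, `count_blocks`, `locate_loss`) are hypothetical and introduced only for illustration.

```python
from typing import List, Set, Tuple

Packet = Tuple[int, int]  # (sequence number, marking bit)


def mark_packets(num_packets: int, block_size: int) -> List[Packet]:
    """Single-bit alternate marking: the bit flips every `block_size` packets."""
    return [(seq, (seq // block_size) % 2) for seq in range(num_packets)]


def forward(packets: List[Packet], dropped: Set[int]) -> List[Packet]:
    """Simulate one hop that drops the packets whose sequence numbers are in `dropped`."""
    return [p for p in packets if p[0] not in dropped]


def count_blocks(packets: List[Packet]) -> List[int]:
    """Count packets in each run of identical marking bits (one run = one block)."""
    counts: List[int] = []
    prev_bit = None
    for _, bit in packets:
        if bit != prev_bit:  # the bit flipped, so a new block starts
            counts.append(0)
            prev_bit = bit
        counts[-1] += 1
    return counts


def locate_loss(per_node_counts: List[List[int]]) -> List[Tuple[int, int, int]]:
    """Compare counters of adjacent nodes.

    Returns (hop, block, lost) tuples, where hop h is the link between
    node h and node h + 1.
    """
    findings = []
    for hop in range(len(per_node_counts) - 1):
        upstream, downstream = per_node_counts[hop], per_node_counts[hop + 1]
        for block, (up, down) in enumerate(zip(upstream, downstream)):
            if up > down:
                findings.append((hop, block, up - down))
    return findings


# Three nodes on one path; packet 5 is lost on hop 0 and packet 9 on hop 1.
node0 = mark_packets(num_packets=12, block_size=4)
node1 = forward(node0, dropped={5})
node2 = forward(node1, dropped={9})
counts = [count_blocks(n) for n in (node0, node1, node2)]
print(counts)               # [[4, 4, 4], [4, 3, 4], [4, 3, 3]]
print(locate_loss(counts))  # [(0, 1, 1), (1, 2, 1)]
```

This sketch deliberately ignores the harder cases the abstract mentions: if an entire block is lost, two runs with the same bit merge and the run-based counters become misaligned, and multi-path load balancing requires per-path accounting that is not modeled here.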