Rapid Detection And Localization Of Multiple Gray Failures In Data Centers Via In-band Network Telemetry

Posted on: 2023-10-22
Degree: Master
Type: Thesis
Country: China
Candidate: C H Jia
Full Text: PDF
GTID: 2568306914962479
Subject: Information and Communication Engineering
Abstract/Summary:
As data centers are used more widely and more deeply across the information technology industry, the reliability and stability of data center networks have received growing attention. Users expect a data center network to keep working when failures occur, and to locate the root cause quickly so that the network can return to a normal state. However, one class of failures, called gray failures, generates no clear alarm when it occurs and can therefore cause severe damage to the network. To improve the network's ability to detect and locate gray failures while reducing the damage they cause, this thesis proposes INT-detect, a failure detection and localization system based on in-band network telemetry. INT-detect probes the connectivity of all feasible packet paths in the network through in-band network telemetry and stores the telemetry results on an edge server. When a failure occurs, the edge server detects it from the abnormal telemetry results. The edge server then uses the telemetry results and source routing to customize the forwarding paths of data packets, migrating the affected traffic to healthy paths in real time. At the same time, FTA-pinpoint, a failure localization algorithm based on fault tree analysis, uses the faulty paths to locate the failures. To verify the feasibility of INT-detect and FTA-pinpoint in detecting failures, migrating affected traffic, and locating failures, this thesis evaluates them using Mininet, a tool for building simulated networks, and the BMv2 software P4 switch. The evaluation results show that INT-detect can detect network failures and migrate affected traffic in a relatively short time. They also show the performance of FTA-pinpoint under different parameter settings and offer suggestions for improving its performance in practical scenarios.
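The idea of narrowing down a failure from the set of faulty probe paths can be illustrated with a minimal Python sketch. This is not the FTA-pinpoint algorithm, whose details the abstract does not give. It is a simple path-intersection heuristic under stated assumptions: each path is a list of links, a link is a hypothetical pair of switch names, and any link seen on a healthy path is treated as working (which may not hold for a gray failure that drops only some packets). The function names `rank_suspect_links` and `pick_healthy_path` are illustrative only.

```python
from collections import Counter


def rank_suspect_links(failed_paths, healthy_paths):
    """Rank links that could explain the failed probe paths.

    Assumption: a link that appears on any healthy path is working.
    """
    healthy_links = {link for path in healthy_paths for link in path}
    counts = Counter(
        link
        for path in failed_paths
        for link in set(path)  # count each link once per failed path
        if link not in healthy_links
    )
    # Most frequently implicated links first; ties ordered by link name.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


def pick_healthy_path(candidate_paths, suspect_links):
    """Return the first candidate path that avoids every suspect link, or None."""
    suspects = set(suspect_links)
    for path in candidate_paths:
        if suspects.isdisjoint(path):
            return path
    return None


# Hypothetical topology: two probes through s2 -> s4 fail, two through s2 -> s5 succeed.
failed = [[("s1", "s2"), ("s2", "s4")], [("s3", "s2"), ("s2", "s4")]]
healthy = [[("s1", "s2"), ("s2", "s5")], [("s3", "s2"), ("s2", "s5")]]

ranking = rank_suspect_links(failed, healthy)
backup = pick_healthy_path(failed + healthy, [link for link, _ in ranking])
```

In this example the only link shared by the failed paths and absent from the healthy ones is `("s2", "s4")`, so it is the sole suspect, and the first path that avoids it is chosen as the migration target, loosely mirroring the source-routing step described above.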
Keywords/Search Tags: Data Center Networks, Network Failure, In-band Network Telemetry, Source Routing, Fault Tree Analysis