
Research On Data Center Flow Type Prediction Based On Deep Learning

Posted on: 2021-04-30
Degree: Master
Type: Thesis
Country: China
Candidate: J Q Zeng
Full Text: PDF
GTID: 2438330611954125
Subject: Electronic and communication engineering
Abstract/Summary:
In data center networks, predicting flow type is the basis for optimal flow scheduling, but existing prediction methods leave room for improvement in accuracy, control overhead, prediction time, and flow granularity. Drawing on the multi-dimensional feature representation ability of deep learning and the centralized control of software-defined networking (SDN), and addressing the data center's differing requirements for flow granularity and model size, this thesis proposes the following three solutions.

First, for elephant flow prediction, a two-stage mechanism of edge pre-classification and central fine classification is proposed. The pre-classification model uses a residual network with a cost-sensitive Softmax cross-entropy loss, while the fine classification model uses a residual network with an Additive Margin Softmax cross-entropy loss. A random forest model first selects ten features across three dimensions: the time distribution features of the flow, the real-time features of the flow, and the packet header features. Then the pre-classification model, deployed on the SDN switches at the network edge, filters out most mouse flows. Finally, the fine classification model, deployed on the SDN controller, accurately identifies the elephant flows. Experiments on a public data set show that, using the first five packets of a flow, the recall reaches 91%, the accuracy reaches 93%, the control overhead is 0.1 kbps, and the prediction time is 7 ms. Compared with existing mainstream mechanisms such as FlowSeer, ESCA, and NELLY, the proposed mechanism improves on all performance metrics: its Matthews correlation coefficient is 2.52 times that of NELLY, its prediction time is 0.35% of FlowSeer's, and its control overhead is 0.046% of ESCA's.

Second, for multi-class flow type prediction, another two-stage mechanism is proposed. The first-stage model uses a unidirectional single-layer GRU (Gated Recurrent Unit) with a cost-sensitive Softmax cross-entropy loss. The second-stage model uses a two-layer GRU with a Softmax cross-entropy loss, in which the first layer is bidirectional and the second layer is unidirectional. First, the distribution of flow rate over flow size and flow duration is analyzed to determine the classification thresholds. Second, ten features are selected using variance-based selection and a random forest model. The first-stage model, deployed on the SDN switches at the network edge, filters out the two flow types with short duration; the remaining flows are then further identified by the second-stage model deployed on the SDN controller. Experiments on a public data set show that, using the first five packets of a flow, the accuracy reaches 86% and the Kappa coefficient reaches 0.79.

Third, to reduce the complexity of the prediction model, a model compression method based on knowledge distillation is proposed. A teacher model and a student model are designed; the trained teacher model and the actual labels of the training data are then used together to guide the training of the student model. Experiments on a public data set show that the student model can be reduced to about 20% of the size of the teacher model, and that when the teacher model reaches an accuracy of 92.18% and a recall of 91.31%, the student model reaches an accuracy of 89.81% and a recall of 88.71%.
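
The abstract does not state the exact distillation objective, so the following is only a minimal sketch of one common formulation, in which the student is trained on a weighted sum of a hard-label loss (actual labels) and a soft-label loss (the teacher's temperature-softened outputs). The function names, the temperature of 4.0, and the weight alpha of 0.5 are illustrative assumptions, not values taken from the thesis.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax with temperature scaling."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=4.0, alpha=0.5):
    """Per-sample loss combining actual labels and teacher guidance.

    temperature and alpha are hypothetical hyperparameters chosen
    for illustration; the thesis abstract does not report them.
    """
    # Hard-label term: cross-entropy against the ground-truth class.
    hard = -math.log(softmax(student_logits)[label])

    # Soft-label term: cross-entropy against the teacher's softened distribution.
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    soft = -sum(pt * math.log(ps) for pt, ps in zip(p_teacher, p_student))

    # Scaling by temperature**2 keeps the soft term's gradients comparable
    # in magnitude to the hard term's as the temperature changes.
    return alpha * hard + (1.0 - alpha) * temperature ** 2 * soft


if __name__ == "__main__":
    teacher = [2.0, -1.0]  # e.g. logits for (elephant, mouse)
    print(distillation_loss([1.5, -0.5], teacher, label=0))
```

In this formulation, a student whose logits agree with both the teacher and the true label receives a lower loss than one that contradicts them, which is the sense in which the teacher and the labels jointly guide training.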
Keywords/Search Tags: Deep learning, Software Defined Data Center Networks, Flow type, Prediction