
Research On Service Performance Optimization Problems In Data Centers

Posted on: 2022-12-24    Degree: Doctor    Type: Dissertation
Country: China    Candidate: Y Wang    Full Text: PDF
GTID: 1488306764958459    Subject: Computer system architecture
Abstract/Summary:
With the rapid evolution of the Internet, cloud computing, and artificial intelligence, the data center has become essential infrastructure for modern society. Data center service providers deploy various services in their data centers and offer them to users, and reducing Capital Expenditure and Operating Expenses is a critical optimization goal for these providers. As a prerequisite for optimizing data center services, providers can measure the data center network to collect the data center's working state and gather the characteristics of these services, and then optimize the deployed services based on the measurements. These services can be categorized as Network Intensive Services, Storage Intensive Services, and Computation Intensive Services. The main contributions of the dissertation are as follows.

1. Network Measurement Framework Design on Programmable Switches

Accurate and fine-grained traffic measurements are crucial for various network management tasks. Recent research introduces counter-based and sketch-based approaches to traffic measurement. However, implementing accurate and fine-grained traffic measurement is very challenging due to the rigid constraints on measurement resources. Chapter 2 of the dissertation aims to design efficient traffic measurement schemes for programmable networks. The dissertation proposes a single-node traffic measurement scheme called FlexMon to accurately measure fine-grained flows in a single network node. FlexMon separates large flows from small ones and uses dedicated flow rules and sketches to measure large and small flows, respectively. The dissertation implements FlexMon on FPGA and CPU to process five typical measurement tasks. Experimental results show that the single-node measurement scheme achieves much faster speed and higher accuracy than the state of the art.
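To make the large/small-flow separation concrete, the following minimal Python sketch illustrates the general idea: flows are tracked in a shared Count-Min sketch until their estimated size crosses a threshold, at which point they are promoted to a dedicated exact rule. The class names, promotion threshold, and sketch dimensions are illustrative assumptions, not FlexMon's actual data-plane design.

import hashlib

class CountMinSketch:
    # Shared approximate structure for the many small flows.
    def __init__(self, rows=4, cols=1024):
        self.rows, self.cols = rows, cols
        self.table = [[0] * cols for _ in range(rows)]

    def _index(self, row, key):
        h = hashlib.md5(f"{row}:{key}".encode()).hexdigest()
        return int(h, 16) % self.cols

    def update(self, key, count=1):
        for r in range(self.rows):
            self.table[r][self._index(r, key)] += count

    def query(self, key):
        return min(self.table[r][self._index(r, key)] for r in range(self.rows))

class LargeSmallFlowMonitor:
    # Dedicated exact rules for large flows, sketch for the rest.
    # Threshold and rule-table size are assumptions for illustration only.
    def __init__(self, promote_threshold=1000, max_exact_rules=256):
        self.exact = {}
        self.sketch = CountMinSketch()
        self.threshold = promote_threshold
        self.max_rules = max_exact_rules

    def record_packet(self, flow_key, pkt_bytes):
        if flow_key in self.exact:
            self.exact[flow_key] += pkt_bytes
            return
        self.sketch.update(flow_key, pkt_bytes)
        # Promote the flow to an exact rule once its estimate crosses the
        # threshold and a rule slot is still free.
        if (self.sketch.query(flow_key) >= self.threshold
                and len(self.exact) < self.max_rules):
            self.exact[flow_key] = self.sketch.query(flow_key)

    def size_of(self, flow_key):
        return self.exact.get(flow_key, self.sketch.query(flow_key))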
2. Data Center Network-wide Measurement Optimization Problems

Fine-grained and accurate network flow measurements are essential for various network management tasks. In recent years, the evolution of programmable networks has enabled flow measurement on the switch itself. However, the limited hardware resources of programmable switches drive network measurement to shift from a single switch to network-wide coordination. Chapter 3 of the dissertation aims to optimize the allocation of flow measurement tasks among switches in network-wide measurement scenarios, with measurement coverage and accuracy as the objectives. The dissertation proposes a network-wide traffic measurement scheme to support network-wide measurement. To further improve measurement performance by efficiently leveraging network-wide measurement resources, the dissertation designs a Graph Neural Network model, NeuralMon, that can precisely model and solve the above problem. NeuralMon converts network topologies and network flows into a hypergraph and transforms the flow measurement task allocation problem into a node classification problem, so it learns the task allocation solution directly from the topologies and flows. Even on untrained real-world network topologies, NeuralMon still delivers excellent performance.

3. Virtual Switch Dynamic Resource Allocation Optimization in Data Centers

In data center networks, the virtual machines hosted on a server are connected to a virtual switch that is responsible for forwarding all packets of the services deployed on those virtual machines. Without an efficient resource allocation and packet scheduling strategy, the virtual switches therefore become a performance bottleneck for Network Intensive Services. Moreover, the highly dynamic characteristics of Network Intensive Services make the resource allocation and packet scheduling problem for virtual switches surprisingly challenging. To guarantee the performance of Network Intensive Services, Chapter 4 of the dissertation investigates the joint optimization of dynamic resource allocation and packet scheduling for virtual switches. The dissertation models this joint problem as a mathematical optimization problem, analyzes it with the Lyapunov Optimization Framework, and derives efficient optimization algorithms with performance tradeoff bounds. Finally, the dissertation evaluates these algorithms on a testbed and a network-wide simulation platform. Experiment results show that the algorithms outperform other designs and meet the theoretical performance bound.
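The dissertation's exact formulation is not reproduced here, but Lyapunov-based designs of this kind typically rest on the standard drift-plus-penalty bound. The following is a generic sketch with assumed symbols (Q_i(t) for queue backlogs, p(t) for the penalty to be minimized, V for the tradeoff knob), not the specific model of Chapter 4:

L(\mathbf{Q}(t)) = \tfrac{1}{2}\sum_{i} Q_i(t)^2,
\qquad
\Delta(\mathbf{Q}(t)) = \mathbb{E}\!\left[ L(\mathbf{Q}(t+1)) - L(\mathbf{Q}(t)) \,\middle|\, \mathbf{Q}(t) \right]

\text{each slot, choose the control action that minimizes}\quad
\Delta(\mathbf{Q}(t)) + V\,\mathbb{E}\!\left[ p(t) \,\middle|\, \mathbf{Q}(t) \right]

Greedily minimizing this bound in every slot gives the usual tradeoff: the time-average penalty is within O(1/V) of optimal while the time-average queue backlog grows as O(V), which is the kind of performance tradeoff bound referred to above.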
4. Collaborative Caching System Performance Optimization in Data Centers

As an important data center Storage Intensive Service, the distributed cache system is widely deployed in data centers to provide timely data storage services to users, reducing content fetching latency and traffic load. However, implementing efficient caching in data centers is challenging. Chapter 5 of the dissertation aims to enhance the overall caching efficiency of the cache system in data centers. To this end, the dissertation designs a cooperative cache model in which a content object is cached at only a single node to eliminate caching redundancy, while all cache nodes serve users' content requests cooperatively. The dissertation then studies the content caching optimization problem, which decides the optimal caching places for content objects under the cooperative cache model; it formulates this problem and solves it efficiently. The experiment results demonstrate that the proposed cooperative cache model and content caching strategy achieve a much better cache hit ratio and content access latency than state-of-the-art solutions.

5. Sparse Neural Network Inference Performance Optimization in Data Centers

Neural network inference is an essential data center Computation Intensive Service. Leveraging sparsity in deep neural network (DNN) models is promising for accelerating model inference services in data centers, yet existing GPUs can only exploit the sparsity of weights, not of activations, which are dynamic, unpredictable, and challenging to exploit. Chapter 6 of the dissertation proposes a novel architecture to efficiently harness dual-side sparsity (i.e., both weight and activation sparsity). The dissertation takes a systematic approach to understanding the advantages and disadvantages of previous sparsity-related architectures and proposes a novel, unexplored paradigm that combines an outer-product computation primitive with a bitmap-based encoding format. The dissertation demonstrates the feasibility of the design with minimal changes to the existing production-scale inner-product-based Tensor Core. It also proposes a set of novel ISA extensions and co-designs the matrix-matrix multiplication and convolution algorithms, the two dominant computation patterns in today's DNN models, to exploit the new dual-side sparse Tensor Core. The evaluation shows that the design fully unleashes dual-side DNN sparsity and improves performance by up to one order of magnitude with small hardware overhead.
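As a software analogy of the outer-product-plus-bitmap paradigm, the following Python/NumPy sketch multiplies two sparse matrices from bitmap-encoded columns of A and rows of B, performing each rank-1 update only at positions where both bitmaps are set. It illustrates the computation pattern only; the actual contribution is a hardware Tensor Core extension, and all names below are illustrative.

import numpy as np

def pack(vec):
    # Bitmap of nonzero positions plus the packed nonzero values.
    bitmap = vec != 0
    return bitmap, vec[bitmap]

def dual_side_sparse_matmul(A, B):
    m, k = A.shape
    k2, n = B.shape
    assert k == k2
    C = np.zeros((m, n))
    for t in range(k):
        a_bits, a_vals = pack(A[:, t])   # sparse column of A (e.g., weights)
        b_bits, b_vals = pack(B[t, :])   # sparse row of B (e.g., activations)
        if not a_bits.any() or not b_bits.any():
            continue                     # skip an entirely zero rank-1 update
        # Rank-1 update restricted to the rows/columns selected by both bitmaps.
        C[np.ix_(a_bits, b_bits)] += np.outer(a_vals, b_vals)
    return C

# Quick check against the dense product on random ~70%-sparse inputs.
A = np.random.rand(8, 8) * (np.random.rand(8, 8) > 0.7)
B = np.random.rand(8, 8) * (np.random.rand(8, 8) > 0.7)
assert np.allclose(dual_side_sparse_matmul(A, B), A @ B)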
Keywords/Search Tags: Data Center Network, Network Measurement, Data Center Resource Allocation, Distributed Cache System, Sparse Neural Network Inference