Research on Optimization of CDN Caching Strategy Based on Machine Learning

Posted on: 2022-01-23
Degree: Master
Type: Thesis
Country: China
Candidate: J W Zhang
GTID: 2518306572991009
Subject: Computer system architecture
Abstract/Summary:
The content delivery network (CDN), a network storage architecture that currently carries about 70% of global Internet traffic, plays an important role in accelerating network applications. CDN caching, a core component of the CDN, has been widely studied as a way to improve the CDN's quality of service (QoS). However, with the rapid development of Internet applications, the businesses that CDNs serve have become increasingly diverse. The resulting variety of workloads under the existing multi-tier CDN cache architecture makes it harder to optimize caching performance. An in-depth analysis of CDN workloads across tiers and business types, followed by caching-strategy optimization based on its findings, is therefore of great significance for CDN cache optimization.

Existing CDN workload analysis studies fail to fully characterize all workloads under the current CDN architecture. To address this, we perform an in-depth analysis of the characteristics of multi-tier, multi-business CDN workloads and study the impact of access patterns on cache performance, aiming to provide guidance for CDN cache configuration. We first collect traces of four CDN workloads spanning multiple tiers and businesses, and then extract the characteristics of the four workloads, including file size, request pattern, temporal locality, and popularity distribution. The analysis leads to several insights: first, the distribution of object size is highly correlated with business type; second, traffic over time shows strong diurnal and weekly patterns; third, the temporal locality of Web business differs across tiers, whereas that of VoD business does not; fourth, access popularity follows a Zipfian distribution and there are a large number of one-time-access objects, especially in the middle tier. We then evaluate the cache performance of the four workloads under various cache configurations to verify these conclusions about their access patterns and to provide cache configuration guidance for various CDN businesses.

Because CDN workloads contain a large number of one-time-access objects, caching such objects causes unnecessary write traffic and cache pollution, which degrades the cache hit rate. To solve this problem, a One-Time-Access Exclusion cache admission policy (OTAE) is proposed, which consists of a classifier and a history table and filters out one-time-access objects. First, the criteria for one-time-access objects are determined through quantitative analysis. Based on these criteria, a classifier is built with the decision tree algorithm to make a preliminary determination of whether a missing object is a one-time-access object. To mitigate the negative impact of the classifier's false positives on cache performance, a history table is introduced that maintains the metadata of one-time-access objects. Experimental results show that OTAE improves cache performance. Taking LRU as an example, applying OTAE improves the cache byte hit rate by 4.5%, reduces write traffic by 69.3%, and decreases response latency by 5.2%.

For diversified CDN workloads, existing caching strategies offer limited performance improvement, and increasing cache capacity to improve cache performance has a poor cost-performance ratio. To alleviate this problem, a Last-Time-Access Exclusion cache replacement policy (LTAE) is proposed, which improves the utilization of cache space and reduces back-to-source traffic by replacing last-time-access objects as much as possible. First, the criteria for last-time-access objects are quantified for different cache capacities based on sensitivity tests. On this basis, two dynamic features, reuse time and access frequency, are introduced to build a classifier with the LightGBM algorithm that predicts whether a hit object is a last-time-access object, so as to guide the replacement decision. Combining OTAE and LTAE yields a CDN caching framework, named the O'LTAE-based caching framework. Experimental results show that, compared to mainstream cache replacement strategies, the O'LTAE-based caching framework improves the byte hit rate by 1.18% to 11.82% on average while reducing write traffic by 53.91% to 59.79%.
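
The OTAE admission idea described in the abstract (a classifier plus a history table placed in front of a cache such as LRU) might be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis implementation. The class name `OtaeLruCache`, the `predicts_one_time` callable (a stand-in for the decision tree classifier), the `history_size` parameter, and the rule "admit an object on its second miss if it is found in the history table" are all hypothetical choices made here to show how a history table could correct classifier false positives.

```python
from collections import OrderedDict
from typing import Callable


class OtaeLruCache:
    """Byte-capacity LRU cache with a one-time-access exclusion filter (sketch)."""

    def __init__(self, capacity_bytes: int,
                 predicts_one_time: Callable[[str, int], bool],
                 history_size: int = 1024) -> None:
        self.capacity_bytes = capacity_bytes
        # Hypothetical stand-in for the trained classifier: returns True when
        # the missing object is predicted to be accessed only once.
        self.predicts_one_time = predicts_one_time
        self.history_size = history_size
        self.cache = OrderedDict()    # key -> size, least recently used first
        self.history = OrderedDict()  # rejected key -> size (assumed metadata)
        self.used_bytes = 0
        self.bytes_written = 0        # total bytes written into the cache

    def request(self, key: str, size: int) -> bool:
        """Serve one request and return True on a cache hit."""
        if key in self.cache:
            self.cache.move_to_end(key)  # refresh LRU position
            return True
        if key in self.history:
            # The object was rejected earlier but has come back, so the
            # earlier "one-time" prediction was a false positive: admit it.
            del self.history[key]
            self._admit(key, size)
        elif self.predicts_one_time(key, size):
            self._remember(key, size)    # do not cache, only record metadata
        else:
            self._admit(key, size)
        return False

    def _remember(self, key: str, size: int) -> None:
        self.history[key] = size
        if len(self.history) > self.history_size:
            self.history.popitem(last=False)  # drop the oldest history entry

    def _admit(self, key: str, size: int) -> None:
        if size > self.capacity_bytes:
            return  # object can never fit
        while self.used_bytes + size > self.capacity_bytes:
            _, evicted_size = self.cache.popitem(last=False)  # evict LRU object
            self.used_bytes -= evicted_size
        self.cache[key] = size
        self.used_bytes += size
        self.bytes_written += size
```

A caller would replay a trace through `request(key, size)` and compare hit bytes and `bytes_written` against a plain LRU run to estimate the effect on byte hit rate and write traffic. In practice the callable would wrap a trained model and use whatever request features the classifier was trained on; the two-argument signature here is only for illustration.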
Keywords/Search Tags: Content Delivery Network, Cache Optimization, Cache Admission Policy, Cache Replacement Strategy