
Research On Photo Cache Optimization Based On Solid-state Disk

Posted on: 2018-04-22
Degree: Master
Type: Thesis
Country: China
Candidate: Z C Wang
GTID: 2348330566451637
Subject: Computer Science and Technology
Abstract/Summary:
With the rapid development of the Internet, the number of images held by social network companies such as Facebook and Tencent is growing exponentially. Traditional cache systems built on disks can no longer meet system performance requirements, and the solid-state disk (SSD), a new kind of storage medium, is increasingly used in cache systems. However, applying a traditional cache design directly to SSDs causes excessive invalid writes, which both waste cache resources and shorten SSD lifespan. To address this problem, the cached data set is divided into a cold data set and a hot data set, and a new caching strategy is proposed for each, improving cache efficiency and significantly reducing the number of SSD writes.

For the cold photo set, a cache selection strategy based on classification prediction is proposed to solve the cache pollution caused by "visit only once" photos. The strategy applies naive Bayesian classification to divide the data into two categories, "visit only once" and "visit more than once", and keeps "visit only once" photos out of the cache. During caching, the algorithm uses a historical information table to compensate for wrongly predicted photos, which reduces the probability of misjudgment. In addition, the "visit only once" prediction weight can be adjusted dynamically according to the real-time state of the cache, so that cache space is used more effectively.

For the hot photo set, the popularity of hot data evolves over time and follows specific trends on the Internet, so an improved cache replacement algorithm, RIPQ-GDSF based on popularity trend, is proposed. In this improved algorithm, access frequency is replaced by photo heat, which predicts and exploits the popularity trend of photo accesses efficiently. In addition, by combining with RIPQ, the algorithm moves cache replacement from the file level to the block level, which reduces the number of SSD writes. The time complexity decreases from O(n) for GDSF to O(1).

A large number of simulation experiments using real-world traces have been conducted. The results show that the cache selection strategy based on classification prediction outperforms traditional caching algorithms by up to 22% in hit rate and by up to 51% in SSD write reduction. The trend-improved RIPQ-GDSF cache replacement algorithm also outperforms traditional caching algorithms, by up to 22% in hit rate and by up to 92% in SSD write reduction.
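
The cold-set selection strategy described in the abstract can be illustrated with a minimal sketch. This is not the thesis implementation. The class name `OneTimeAccessFilter`, the categorical photo features, and the fixed `threshold` standing in for the adjustable "visit only once" prediction weight are all assumptions made for illustration. The sketch shows a naive Bayesian classifier with Laplace smoothing that rejects photos predicted as "visit only once", plus a history set that admits a rejected photo when it is requested a second time.

```python
import math
from collections import defaultdict


class OneTimeAccessFilter:
    """Illustrative naive Bayes admission filter (names and features are assumptions)."""

    def __init__(self, threshold=0.5):
        # Hypothetical stand-in for the adjustable "visit only once" weight.
        self.threshold = threshold
        # Label True means "visit only once", False means "visit more than once".
        self.class_counts = {True: 0, False: 0}
        self.feature_counts = {True: defaultdict(int), False: defaultdict(int)}
        self.feature_values = defaultdict(set)  # distinct values seen per feature index
        self.history = set()  # photos rejected earlier as "visit only once"

    def train(self, features, once):
        """Record one labelled sample of categorical features."""
        self.class_counts[once] += 1
        for i, value in enumerate(features):
            self.feature_counts[once][(i, value)] += 1
            self.feature_values[i].add(value)

    def prob_once(self, features):
        """Posterior probability that a photo with these features is visited only once."""
        total = sum(self.class_counts.values())
        log_scores = {}
        for label in (True, False):
            # Laplace-smoothed prior and likelihoods, accumulated in log space.
            score = math.log((self.class_counts[label] + 1) / (total + 2))
            for i, value in enumerate(features):
                count = self.feature_counts[label].get((i, value), 0)
                n_values = len(self.feature_values[i] | {value})
                score += math.log((count + 1) / (self.class_counts[label] + n_values))
            log_scores[label] = score
        # Normalise the two joint scores into a posterior for the "once" class.
        peak = max(log_scores.values())
        weights = {k: math.exp(v - peak) for k, v in log_scores.items()}
        return weights[True] / (weights[True] + weights[False])

    def admit(self, photo_id, features):
        """Return True if the photo should be written to the SSD cache."""
        if photo_id in self.history:
            # A second request shows the earlier prediction was wrong: admit now.
            self.history.discard(photo_id)
            return True
        if self.prob_once(features) > self.threshold:
            self.history.add(photo_id)
            return False
        return True
```

In this sketch, raising `threshold` admits more photos and lowering it filters more aggressively; a real system would presumably tune it from live cache statistics, as the abstract suggests, but the tuning rule is not given in the source.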
Keywords/Search Tags: SSD, Photo cache, Naive Bayesian Classification, Popularity, RIPQ-GDSF