
Research On Configuration Parameters Recommendation Method For Storage System

Posted on: 2023-05-17
Degree: Master
Type: Thesis
Country: China
Candidate: Z X Ren
Full Text: PDF
GTID: 2568307046964569
Subject: Computer technology
Abstract/Summary:
To adapt to each user’s usage environment and meet users’ performance requirements, storage systems generally expose a large number of configurable parameters. The default configuration, however, does not guarantee optimal performance under all circumstances, so a configuration parameter recommendation algorithm is needed to optimize the system configuration by adjusting parameter values. At present, a single performance test (sample) of a distributed storage system is very time-consuming, so the recommendation algorithm must find relatively optimal parameter values within a limited number of samples. It should therefore choose each sample carefully and make reasonable and full use of the sampled data.

This thesis proposes improvements that address problems in existing configuration parameter recommendation methods. First, because existing algorithms cannot perceive the target space in the sampling stage, we propose an adaptive progressive Latin hypercube sampling algorithm. It uses a random forest regression model to dynamically identify under-sampled areas and selectively performs progressive sampling there, so that the sampled data reflects the actual distribution of the target space. Second, to address the collinearity problem of regularization algorithms in the parameter selection stage, we propose a parameter importance calculation method based on random forest. Combining random forest with Spearman rank correlation better reflects the dependence between parameters and improves the accuracy of feature selection. Third, to make full use of the sampled data in the optimization stage, we propose an optimal parameter value prediction method. It applies Bayesian optimization to the regression model built in the sampling stage to predict the optimal value of each parameter, and recommends a set of configurations to users in advance.

Finally, we tested our methods using Cosbench as the performance test tool and used them to recommend a configuration for a Ceph cluster. When optimizing the average processing time, the experimental results showed that the optimal configuration obtained by the adaptive progressive Latin hypercube sampling algorithm combined with the improved parameter importance calculation method improved the comprehensive performance by 39.72% compared with the default configuration. Compared with the existing method, the optimization gain was improved by 3.59 times. The optimal parameter value prediction method used only the sampled data to give a relatively good configuration, and its predicted comprehensive performance was about 8% better than that of the default configuration.
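
As a point of reference for the sampling stage described in the abstract, the sketch below shows plain Latin hypercube sampling, the baseline that the adaptive progressive variant builds on. It is not the thesis's algorithm: it has no random forest model and no progressive refinement. The function name `latin_hypercube` and the two parameter ranges are hypothetical and chosen only for illustration.

```python
import random


def latin_hypercube(n_samples, bounds, seed=None):
    """Plain Latin hypercube sampling (illustrative sketch).

    Each parameter range in `bounds` is split into `n_samples` equal strata,
    and every stratum of every parameter is hit exactly once.
    """
    rng = random.Random(seed)
    columns = []
    for low, high in bounds:
        width = (high - low) / n_samples
        # One uniformly random value inside each stratum of this parameter.
        column = [low + (i + rng.random()) * width for i in range(n_samples)]
        # Shuffle so that strata are paired at random across parameters.
        rng.shuffle(column)
        columns.append(column)
    # Transpose: one tuple per sampled configuration.
    return list(zip(*columns))


# Hypothetical parameter ranges, for illustration only.
bounds = [(1, 64), (0.0, 1.0)]
samples = latin_hypercube(8, bounds, seed=0)
for config in samples:
    print(config)
```

With eight samples, each parameter's range is covered by eight non-overlapping strata, so even a small sampling budget touches the whole range of every parameter. This matters when each sample is an expensive performance test.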
Keywords/Search Tags: Storage System, Configuration Recommendation, Auto-tuning, Latin Hypercube Sampling, Random Forest