
Research On Failure Prediction Method For Disk Time Series Data In Cloud Data Center

Posted on: 2022-07-11
Degree: Master
Type: Thesis
Country: China
Candidate: F K Peng
Full Text: PDF
GTID: 2518306326497584
Subject: Master of Engineering
Abstract/Summary:
With the rapid progress of information technology and the fast growth of cloud computing, the total amount of data generated by human society has exploded, and most of this data is stored on disks in cloud data centers. Because of the physical mechanisms of a disk, stored data may be unrecoverable after a failure. Disks do contain internal fault-tolerance mechanisms, but these are reactive or have a low failure detection rate, so they cannot guarantee data security. If disk failures can be predicted in advance, operation and maintenance personnel can take measures to protect data, maintain system stability, and reduce operating costs. Predicting disk failures in cloud data centers is therefore of great significance.

Researchers have studied disk failures extensively and achieved good results, but many practical challenges remain. Disk data suffers from widespread missing values, numerous attributes, imbalanced samples, and inaccurate labels, and existing methods make insufficient use of the temporal and spatial characteristics of disk data. To address these problems, the main work of this thesis is as follows:

(1) For the complex disk data in cloud data centers, cubic spline interpolation is used to fill the missing values of the time series data. To handle the many complex attributes, the rate of change of the time series data is used to expand the attribute set, and a method based on information gain ratio is then used to select features. The selected features better represent the state changes of the disk data.

(2) To address the imbalance of disk data and the inaccuracy of labels, this thesis proposes the Failure Reset Window (FRW) as the main data processing method. FRW processing not only mitigates the sample imbalance but also reduces potentially ambiguous samples and improves the usability of the data.

(3) To make full use of the temporal and spatial characteristics of disk data, this thesis proposes CNN-LSTM, a disk failure prediction method that combines a Convolutional Neural Network (CNN) with Long Short-Term Memory (LSTM). The CNN extracts the spatial features of the data, the LSTM captures the dependencies within the time series, and the combined model further improves the failure prediction rate.

To verify the effectiveness of the proposed FRW+CNN-LSTM failure prediction method, comparative experiments are conducted on open-source data sets from three cloud data centers. The experimental results show that the FRW-based data processing method selects features more reasonably and labels samples more accurately. The CNN-LSTM prediction model makes effective use of the temporal and spatial features of the time series data and improves the failure prediction rate by 3%-20% compared with other algorithms, with a mean prediction lead time of up to 12 days and a false alarm rate as low as 0.15%.
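The feature-expansion and feature-selection step in point (1) can be illustrated with a small sketch. The abstract does not give the thesis's exact formulas, so this assumes the rate of change is a simple first difference and uses the standard C4.5-style information gain ratio (information gain divided by split information). The function names, the sample values, and the `d > 0` binarization are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    total = len(values)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(values).values())


def gain_ratio(feature, labels):
    """Information gain ratio of a discrete feature with respect to labels."""
    total = len(labels)
    # Group the labels by the feature value they co-occur with.
    groups = {}
    for value, label in zip(feature, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - conditional
    split_info = entropy(feature)
    # A feature with a single value carries no split information.
    return info_gain / split_info if split_info > 0 else 0.0


def rate_of_change(series):
    """First difference of a series; the first step has no predecessor, so it is 0."""
    return [0] + [b - a for a, b in zip(series, series[1:])]


# Hypothetical daily readings of one disk attribute and failure labels.
raw = [0, 0, 0, 0, 2, 5, 9, 14]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

# Expand the attribute with its rate of change, then binarize (assumed rule).
delta = rate_of_change(raw)
is_rising = [int(d > 0) for d in delta]

# An unrelated alternating feature for comparison.
noise = [1, 0, 1, 0, 1, 0, 1, 0]

print(gain_ratio(is_rising, labels))
print(gain_ratio(noise, labels))
```

In this toy data the derived rate-of-change feature separates the two classes and scores higher than the unrelated one, which is the kind of ranking a gain-ratio selector relies on. Real disk attributes are continuous and would need a discretization scheme that the abstract does not specify.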
Keywords/Search Tags: Cloud Data Centre, Failure Reset Window, Cutting Window, CNN, LSTM, Disk Failure Prediction