Design And Implementation Of Monitoring System For Disk Status Based On Deep Learning

Posted on: 2023-10-19
Degree: Master
Type: Thesis
Country: China
Candidate: J Cao
Full Text: PDF
GTID: 2568307046965449
Subject: Computer technology
Abstract/Summary:
The explosive growth of data has brought new challenges to the capacity, performance, and reliability of storage devices. Hard disk drives, an important class of storage device in modern storage systems, are prone to mechanical faults that put data at risk of damage and loss. Efficient monitoring of disk status and timely warning of disk failures therefore help to build a highly available storage system. Disk failure prediction systems from existing research usually suffer from oversimplified quantification of disk lifetime, narrowly specialized application scenarios, or high online deployment costs, so they are not suitable for general scenarios. Deep Learning (DL) can now be used to predict disk health intelligently and concisely, which offers an important approach to systematic monitoring of disk status.

To address the problem of predicting disk failures efficiently and precisely, this thesis proposes OHDMS (Online Hard Disk Monitoring System), a system for online monitoring and prediction of disk health state. Specifically, it defines disk lifetime prediction at two levels, covering the full lifetime of the disk and its remaining 24 days respectively. The primary prediction stage covers the full lifetime of the disk, divides it into 5 stages with a granularity of multiples of 24 days, and monitors whether the remaining life of the monitored disk has dropped to only 24 days. The secondary prediction stage divides that remaining lifetime more precisely into 3 stages with a granularity of multiples of 4 days and predicts the remaining useful life. The prediction models are selected by comparing mainstream DL models on factors such as failure detection rate (FDR), false alarm rate (FAR), and prediction accuracy. In addition, OHDMS combines offline training with online retraining. Building on offline training, OHDMS can collect the disk data stream to retrain the DL model online, both to transfer the model to minority disk types through shared parameters and to fine-tune the old model so that it does not age and its performance stays stable.

The OHDMS prototype is implemented in a real server environment. Disk health status prediction is compared across different DL models, and the model is updated and transferred by collecting data streams online. The experimental results show that applying the CNN-LSTM model and the deep LSTM model in the two prediction stages predicts the life stages of a disk accurately, achieving an FDR of 92%-97% with a FAR of about 2%-4%. Under online learning, the updated model almost doubles the prediction accuracy of the old model, and the model obtained via transfer learning achieves an accuracy of more than 86%.
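
The two-level lifetime staging and the FDR/FAR metrics described above can be illustrated with a minimal Python sketch. It is not the OHDMS implementation. The abstract gives only the number of stages and their granularity, so the cut points in PRIMARY_BOUNDS and SECONDARY_BOUNDS, the stage numbering, and all function names are assumptions made for illustration. FDR and FAR follow the usual definitions in disk failure prediction: the fraction of failed disks that were flagged, and the fraction of healthy disks that were wrongly flagged.

```python
from bisect import bisect_left

# ASSUMED cut points, in days of remaining life. The abstract states only
# "5 stages, multiples of 24 days" and "3 stages, multiples of 4 days".
PRIMARY_BOUNDS = [24, 48, 72, 96]  # 5 stages over the full lifetime
SECONDARY_BOUNDS = [8, 16]         # 3 stages within the last 24 days


def primary_stage(remaining_days):
    """Return a stage in 0..4; stage 0 means at most 24 days remain."""
    if remaining_days < 0:
        raise ValueError("remaining_days must be non-negative")
    return bisect_left(PRIMARY_BOUNDS, remaining_days)


def secondary_stage(remaining_days):
    """Return a stage in 0..2 for a disk with at most 24 days remaining."""
    if not 0 <= remaining_days <= 24:
        raise ValueError("secondary stage covers only the last 24 days")
    return bisect_left(SECONDARY_BOUNDS, remaining_days)


def fdr_far(actual_failed, predicted_failed):
    """Compute (FDR, FAR) from per-disk boolean labels and predictions."""
    pairs = list(zip(actual_failed, predicted_failed))
    failed = sum(1 for actual, _ in pairs if actual)
    healthy = len(pairs) - failed
    if failed == 0 or healthy == 0:
        raise ValueError("need at least one failed and one healthy disk")
    detected = sum(1 for actual, pred in pairs if actual and pred)
    false_alarms = sum(1 for actual, pred in pairs if not actual and pred)
    return detected / failed, false_alarms / healthy
```

Under these assumed boundaries, a disk would be handed from the primary to the secondary stage once primary_stage returns 0, that is, once its predicted remaining life is 24 days or less.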
Keywords/Search Tags: Disk Failure Prediction, Deep Learning, Transfer Learning, Long Short-Term Memory