| With the advent of the big data era, data has penetrated all walks of life, and the total amount of data generated by human society has exploded. Most of this data is stored on hard disks in data centers. Because of the physical characteristics of hard disks, however, data stored on a damaged disk may be lost forever. Although data centers can improve data security through backup storage in distributed file systems, this does not eliminate the risk to data posed by hard disk failure. Accurate, early, timely, and convenient failure prediction for hard disks is therefore of great significance for protecting data security and reducing data center operating costs, and hard disk failure prediction has become one of the hot topics in the data center field. First, to address the class imbalance problem in hard disk data, this paper proposes an oversampling principal component analysis (SMOTE_PCA) method, which yields principal components that distinguish faulty hard disks from normal hard disks to a certain degree. Based on these principal components, a distance-based health evaluation standard is then proposed. Using this standard in combination with the gated recurrent unit (GRU) method from deep learning, a GRU-based hard disk failure prediction model (GRU_SMART) is proposed. Finally, comparative experiments against long short-term memory (LSTM) networks, conducted with the proposed health evaluation standard, verify the effectiveness and efficiency of the model and demonstrate the advantages of GRU_SMART in hard disk failure prediction. On this dataset, GRU_SMART obtains its best model with a time step of 6, whereas LSTM obtains its best prediction model with a time step of 14. Moreover, GRU_SMART is more practical than LSTM in terms of convergence speed and also achieves higher prediction accuracy. Its average advance prediction time for hard disk failure is 9 days, compared with 6 days for LSTM, so it can be better applied to real data center scenarios. |
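
As a rough illustration of the pipeline summarized above, the sketch below shows one way a distance-based health score could be computed from principal-component coordinates and then cut into fixed-length windows for a recurrent model. The function names, the centroid-based score formula, and the toy coordinates are assumptions made for illustration only; the abstract does not give the exact definition of the health evaluation standard used in the paper.

```python
import math

def centroid(points):
    """Mean of a list of equal-length coordinate tuples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))

def health_score(point, healthy_centroid, failed_centroid):
    """Hypothetical score in [0, 1]: 1 at the healthy centroid, 0 at the failed one.

    This formula is an assumption, not the paper's definition.
    """
    d_healthy = math.dist(point, healthy_centroid)
    d_failed = math.dist(point, failed_centroid)
    total = d_healthy + d_failed
    if total == 0:
        return 0.5  # the point and both centroids coincide; no information
    return d_failed / total

def sliding_windows(series, time_step):
    """Split a sequence into overlapping windows of length time_step."""
    return [series[i:i + time_step] for i in range(len(series) - time_step + 1)]

# Toy principal-component coordinates (made up for the example).
healthy_c = centroid([(0.0, 0.0), (2.0, 0.0)])
failed_c = centroid([(9.0, 0.0), (11.0, 0.0)])

# Daily coordinates of one disk drifting toward the failed cluster.
daily_points = [(1.0 + 1.5 * day, 0.0) for day in range(7)]
scores = [health_score(p, healthy_c, failed_c) for p in daily_points]

# A time step of 6 matches the best GRU setting reported in the abstract.
windows = sliding_windows(scores, 6)
```

Each window in `windows` could then serve as one input sequence to a GRU or LSTM, with the health score of a later day as the prediction target; that pairing is likewise an assumption about how such a model might be trained.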