| Data mining is the process of extracting implicit, previously unknown, but potentially useful information and knowledge from large volumes of incomplete, noisy, fuzzy and random practical application data [1]. As a learning technique grounded in statistical learning theory, data mining has become a popular research subject in data applications, a prominent topic in statistics and machine learning, and one of the most widely adopted technologies in the big data area. Its theory has therefore been studied thoroughly in recent years and applied widely to modeling, prediction and control. Rockburst is a main cause of microseismic events in coal mines. It is a dynamic phenomenon in which deformation energy stored in the mine shaft and surrounding rock is released through sudden, sharp and violent failure [2]. That is, when underground excavation disturbs the coal and rock, the fracturing coal and rock suddenly releases deformation energy as elastic waves during stress redistribution, with a magnitude generally less than three. A microseismic event in a coal mine is essentially a process of nonlinear deformation and discontinuous failure of the coal-rock structure, so linear models have great difficulty predicting microseismic hazard in coal mines [3][4]. This paper uses data mining techniques to study the hazard state associated with strong, high-energy mine tremors (energy ≥ 10^4 J) in coal mines. The data come from a Polish coal mine where energy and pulse readings are recorded once every eight hours [5][6], and the data sets were obtained from the UCI machine learning repository. 
The data are processed with several machine learning methods: nearest neighbor, decision tree, AdaBoost classification, support vector machines and random forests. The normalized mean squared error (NMSE) under two-fold cross-validation is used to assess the reliability of each method's results, and the NMSE values and prediction accuracies of the algorithms are compared to weigh the strengths and weaknesses of each method and select the most appropriate one. The data sets are managed in R [7-9], and the two-fold cross-validation NMSE and the machine learning algorithms are implemented in the R programming language. A comparison of the error tolerance of the nearest neighbor method, decision tree, bagging algorithm, support vector machine and random forests shows that all of them achieve good error tolerance and classification results. Among them, random forests is best suited to the many attributes and instances involved, and is therefore the preferred method for microseismic hazard prediction in coal mines under high-energy conditions. The study shows that a high-energy shock event is a necessary condition for the occurrence of a microseismic event in a coal mine, and that applying data mining to microseismic monitoring data is feasible: it can analyze the potential links between the various factors and help uncover the formation mechanism and occurrence patterns of microseismic events in coal mines. Although the prediction model presented in this paper cannot accurately predict every coal mine tremor, and produces some missed events and false positives, it can still identify and predict a considerable share of microseismic events in coal mines. It thus provides a reference for reducing seismic hazard and a new data mining approach to microseismic hazard prediction in coal mines. |
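
The evaluation procedure named in the abstract, NMSE under two-fold cross-validation, can be illustrated with a minimal sketch. The paper reports an R implementation; the version below is written in Python purely for illustration and is not the authors' code. It assumes NMSE is the sum of squared prediction errors divided by the sum of squared deviations of the true labels from their mean, with hazard labels coded as 0 and 1. The function names `nmse`, `knn_predict` and `two_fold_nmse`, the choice of a 3-nearest-neighbor classifier, and the toy data in the usage note are all hypothetical.

```python
import random


def nmse(y_true, y_pred):
    """Normalized mean squared error (assumed definition).

    A value of 0 means perfect prediction; a value of 1 matches the
    error of always predicting the mean of y_true.
    """
    mean_y = sum(y_true) / len(y_true)
    num = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    den = sum((t - mean_y) ** 2 for t in y_true)
    if den == 0:
        raise ValueError("NMSE is undefined when all true labels are equal")
    return num / den


def knn_predict(train_x, train_y, x, k=3):
    """Majority vote among the k nearest training points (0/1 labels)."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, x)), label)
        for row, label in zip(train_x, train_y)
    )
    votes = [label for _, label in dists[:k]]
    return 1 if 2 * sum(votes) > len(votes) else 0


def two_fold_nmse(xs, ys, k=3, seed=0):
    """Average test NMSE over the two halves of a random split."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    half = len(idx) // 2
    folds = [idx[:half], idx[half:]]
    scores = []
    for i in (0, 1):
        # Each half serves once as the test set, with the other as training set.
        test_idx, train_idx = folds[i], folds[1 - i]
        train_x = [xs[j] for j in train_idx]
        train_y = [ys[j] for j in train_idx]
        test_y = [ys[j] for j in test_idx]
        preds = [knn_predict(train_x, train_y, xs[j], k) for j in test_idx]
        scores.append(nmse(test_y, preds))
    return sum(scores) / len(scores)
```

As a usage example under the same assumptions, two well-separated synthetic clusters such as `xs = [(0.1 * i, 0.1 * i) for i in range(20)] + [(10 + 0.1 * i, 10 + 0.1 * i) for i in range(20)]` with `ys = [0] * 20 + [1] * 20` can be passed to `two_fold_nmse(xs, ys)`. Because NMSE scales the error by the variance of the labels, scores of different classifiers on the same split are directly comparable, which is the role it plays in the method comparison described above.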