Research On Earthquake Prediction Based On Machine Learning Regression Algorithm And Its Application In China Seismic Experimental Site

Posted on: 2022-03-02
Degree: Master
Type: Thesis
Country: China
Candidate: X Y Shi
Full Text: PDF
GTID: 2480306509499824
Subject: Geophysics
Abstract/Summary:
Earthquakes are natural phenomena characterized by sudden occurrence and destructiveness, and they can cause catastrophic disasters and losses. Earthquake forecasting is a difficult and complex problem. Scientists have long carried out extensive research on earthquake forecasting, have put forward a series of forecasting models, and have made great progress, but the results still cannot meet the urgent needs of today's society. In recent years, with the development of seismic and geophysical observation methods, the volume of seismic observation data has increased dramatically, and machine learning methods suited to big data have shown broad application prospects in earthquake forecasting research. Building on a summary of existing work, this thesis carries out preliminary research on earthquake forecasting based on machine learning regression algorithms, taking the China Seismic Experimental Site (CSES) as the research area and the instrumentally recorded earthquake catalog as the data source.

Firstly, several commonly used machine learning algorithms are summarized and analyzed, and 4 of them are selected: the Generalized Linear Model (GLM); Random Forest (RF) and Gradient Boosting Machine (GBM), both based on the classification and regression tree (CART); and the Deep Neural Network (DNN). Earthquake forecast models are built on these algorithms. The Stacking ensemble learning algorithm with cross-validation is then used to integrate the 4 models: a secondary algorithm combines the forecasting results of the single models to improve forecasting performance.

Secondly, the 1970-2018 earthquake catalog of CSES is compiled from the national earthquake catalog and the Sichuan-Yunnan regional catalog. Temporal and spatial differences in the monitoring capability of the seismic networks leave the catalog incomplete, which affects the calculation of seismicity feature parameters and therefore the forecasting performance of the machine learning models. To address this, a combination of the Magnitude-Sequence-Number method, the Maximum Curvature (MAXC) method, and the Goodness-of-Fit Test (GFT) method is used to analyze the spatial distribution and temporal variation of the minimum magnitude of completeness (Mc) of the CSES catalog. The Mc values of the CSES catalog in different seismic zones and different time segments are then obtained. In this thesis, the minimum magnitude of completeness for CSES is determined to be 2.5.

Thirdly, the commonly used seismicity feature parameters are analyzed and compared, and 16 of them are selected as input variables of the machine learning models, including magnitude-frequency distribution parameters, seismic frequency parameters, seismic energy parameters, and comprehensive parameters. These feature parameters are computed over sliding windows to construct the data sets. The selected machine learning models are trained and tested on these data sets, and the test results are compared. The results show that the window length used to construct the data set has a great influence on the prediction results. Models trained on data sets built with a scalable window length, adapted to the seismicity of each seismic zone, clearly outperform models trained on data sets built with fixed windows.

Then, 4 evaluation methods are used to analyze and evaluate the forecasting performance of the models: the mean absolute error (MAE), the coefficient of determination (R²), the regression error characteristic (REC) curve with the corresponding area over the curve (AOC), and the -score. The results show that RF has the best forecasting performance among all models. GBM is inferior to RF, and GLM and DNN perform poorly. The Stacking models are close to RF and show no significant improvement over it. The forecasting performance of each model is best in the range ML 4.0-6.9, followed by 3.0-3.9 and 7.0-7.9, and worst in 2.5-2.9. The forecasting performance of these models varies greatly between seismic zones: the Songpan-Longmenshan Zone, Longling Zone, Lancang-Gengma Zone, and Sipu Zone show better results, while the Aba Zone and Litang-Muli Zone show worse results. The forecasting performance of each model in each zone and each magnitude range is broadly consistent with the overall result. RF and the Stacking models have relatively high -scores, indicating better forecasting performance, followed by GBM and DNN; the GLM model performs poorly.

Finally, the contribution of the 16 seismicity feature parameters to the forecasting results of the 4 models is analyzed. The results show that the magnitude-frequency distribution parameters contribute the most, followed by the seismic energy parameters and then the comprehensive parameters, while the contribution of the frequency parameters is rather low. Moreover, the contribution of each feature parameter in each model varies greatly between seismic zones.
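The abstract names MAE, the coefficient of determination, and the REC curve with its AOC as evaluation measures but does not give their formulas. The following is a minimal sketch of how these three quantities are commonly computed for a regression forecast. It is not the thesis's implementation. The function names and the sample magnitudes are hypothetical and chosen only for illustration, and the REC curve here uses the absolute error as its tolerance measure, which is an assumption.

```python
from typing import List, Sequence, Tuple


def mean_absolute_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average absolute difference between observed and forecast values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rec_curve(y_true: Sequence[float], y_pred: Sequence[float]) -> List[Tuple[float, float]]:
    """REC curve as (tolerance, accuracy) points.

    Accuracy at a tolerance is the fraction of samples whose absolute
    error does not exceed that tolerance (absolute error is an assumption).
    """
    errors = sorted(abs(t - p) for t, p in zip(y_true, y_pred))
    n = len(errors)
    return [(err, (i + 1) / n) for i, err in enumerate(errors)]


def area_over_curve(curve: Sequence[Tuple[float, float]]) -> float:
    """Area between the REC step curve and accuracy = 1, from tolerance 0
    up to the largest observed error."""
    area = 0.0
    prev_tol, prev_acc = 0.0, 0.0
    for tol, acc in curve:
        # Between two tolerances the accuracy stays at its previous level.
        area += (tol - prev_tol) * (1.0 - prev_acc)
        prev_tol, prev_acc = tol, acc
    return area


# Hypothetical observed and forecast magnitudes, for illustration only.
observed = [3.0, 4.5, 5.2, 6.1]
forecast = [3.2, 4.1, 5.0, 6.6]

mae = mean_absolute_error(observed, forecast)
r2 = r_squared(observed, forecast)
curve = rec_curve(observed, forecast)
aoc = area_over_curve(curve)
print(f"MAE={mae:.3f}  R2={r2:.3f}  AOC={aoc:.3f}")
```

With the step-function integration used in this sketch, the AOC taken over the full error range coincides with the MAE, so a smaller AOC corresponds to a model whose REC curve rises faster. Studies that truncate the tolerance axis or use a different error measure will obtain different AOC values.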
Keywords/Search Tags: Earthquake forecasting, Machine learning regression models, Effectiveness evaluation, China Seismic Experimental Site