Feature Engineering And Susceptibility Analysis Of Generalized Landslide Hazard

Posted on: 2022-05-02
Degree: Master
Type: Thesis
Country: China
Candidate: X Ling
Full Text: PDF
GTID: 2480306353468374
Subject: engineering
Abstract/Summary:
Landslide susceptibility analysis assesses the likelihood of landslides under a set of triggering environmental factors, and its methods can be divided into qualitative and quantitative approaches. In recent years, real-world data have become increasingly quantified and diversified, and the gradual adoption of complex machine learning algorithms has moved quantitative susceptibility analysis toward intelligence and automation. However, massive data volumes and complex methods also bring problems such as data redundancy, overfitting, and reduced interpretability. Feature engineering comprises a series of important data preprocessing methods that can effectively reduce data redundancy and identify the dominant factors of an event. This study takes Tianshui City, Gansu Province, as an example to carry out generalized landslide susceptibility analysis and the corresponding feature engineering. First, the original datasets assumed to be related to landslides, collapses, and unstable slopes are processed into attribute features; feature processing and feature selection are then performed. By combining and analyzing the different selection results, the features considered less important are eliminated, and landslide susceptibility evaluation is carried out. The main achievements of this research are as follows:

(1) To address the lack of a unified standard for feature processing, two dimensionless processing methods were tested. The results show that standardization significantly enhances the features and greatly improves model accuracy, especially for models with stricter data requirements (for example, the Support Vector Machine model). The model score improved by up to 12.54% and the AUC by up to 6.11%.

(2) Unclear selection criteria for hazard causal features lead to low model interpretability, poor computational efficiency, and data redundancy. To address this, four feature selection methods are proposed and tested, all based on the embedded method built into the random forest model. The ideas of the filter method and the wrapper method are combined, and two indicators, mean impurity and mean accuracy, are used to perform the feature selection. A comparison of the results shows that all of the feature selection methods are effective in improving accuracy, with the wrapper-embedded methods achieving the largest improvement, ranging from 1% to 2%. However, overfitting and convergence to local optima are serious problems, and the time cost is high.

(3) Four machine learning models were selected and tested: the Classification and Regression Tree model (CART), the Random Forest model (RF), the Logistic Regression model (LR), and the Support Vector Machine model (SVM). For all types of landslide hazard, the random forest model achieves the highest susceptibility evaluation accuracy: 94.89% (landslide), 95.71% (collapse), and 94.89% (unstable slopes). In its mapping results, the three types of landslide hazard are better distinguished, and the high-risk areas are better separated from the mid-high-risk areas.

The analysis of the results indicates that generalized landslides in the study area are relatively strongly affected by human activities, and that most areas of high susceptibility are concentrated in river valleys, wetlands, and farming areas with weaker lithology and denser river networks.
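
Achievement (1) above reports that standardization helped scale-sensitive models such as the SVM. The following is a minimal sketch of what such dimensionless processing can look like, not the thesis implementation. The abstract does not name the two methods that were tested, so pairing z-score standardization with min-max scaling is an assumption, and the function names and the elevation and slope values are hypothetical illustration data.

```python
from statistics import mean, pstdev


def standardize(column):
    """Z-score scaling: subtract the mean, divide by the population std. dev."""
    mu = mean(column)
    sigma = pstdev(column)
    if sigma == 0:
        # A constant feature carries no information; map it to zeros.
        return [0.0 for _ in column]
    return [(value - mu) / sigma for value in column]


def min_max(column):
    """Min-max scaling: map the smallest value to 0 and the largest to 1."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]
    return [(value - lo) / (hi - lo) for value in column]


# Hypothetical conditioning factors on very different scales (not thesis data).
elevation_m = [1100.0, 1350.0, 1600.0, 1850.0, 2100.0]
slope_deg = [5.0, 12.0, 20.0, 28.0, 35.0]

z_elevation = standardize(elevation_m)
z_slope = standardize(slope_deg)
mm_elevation = min_max(elevation_m)

# After standardization both features have mean 0 and unit spread, so
# neither dominates a distance- or margin-based model by scale alone.
print([round(v, 3) for v in z_elevation])
print([round(v, 3) for v in z_slope])
print(mm_elevation)
```

Without rescaling, a feature measured in meters would outweigh one measured in degrees in any distance computation, which is one plausible reason a margin-based model benefits more from this step than a tree-based one.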
Keywords/Search Tags:landslide, landslide susceptibility analysis, random forest algorithm, feature engineering