
Improvement And Application Of Random Forest Algorithm

Posted on: 2021-05-27, Degree: Master, Type: Thesis
Country: China, Candidate: X Qin, Full Text: PDF
GTID: 2428330605461061, Subject: Computer technology
Abstract/Summary:
Landslides are a severe natural geological disaster that occurs worldwide. In China, geological disasters are frequent and destructive. Every year, landslides cause tremendous property loss, infrastructure destruction, and casualties, and they set back infrastructure construction and economic growth in the affected areas. It is therefore useful for landslide protection to screen the inducing factors in landslide areas, determine how these factors influence the occurrence of landslides, and use them to make accurate classifications and predictions from previously collected data.

The Random Forest (RF) algorithm has been widely used since it was first proposed. Many specialists and scholars apply it to classification and regression problems because of its advantages over similar algorithms. Although the RF algorithm has few parameters and does not over-fit easily, it cannot predict the minority class accurately when dealing with unbalanced data, which leads to a large error between the final classification results and the actual results. The choice of algorithm parameters also directly affects the final classification results, so it is essential to find the optimal combination of RF parameters. In this research, we introduce an enhanced algorithm that addresses the problems of unbalanced data and parameter selection in the traditional RF algorithm. The main work of this paper is as follows:

(1) The basic principles and implementation steps of the RF algorithm are briefly described, related research on the RF algorithm is introduced in detail, and some ideas for improving on the existing problems of the RF algorithm are put forward.

(2) An Unbalanced Accuracy Weighted Random Forest (UAW_RF) algorithm is proposed, with its parameters optimized by the Adaptive Step Size Artificial Bee Colony (ASSABC) algorithm. UAW_RF combines the ideas of decision tree optimization, sampling selection, and weighted voting to improve the ability of the RF algorithm to classify unbalanced data. The ideas of an adaptive step size and the optimal solution are introduced to enhance the position updating formula of the Artificial Bee Colony (ABC) algorithm. The parameter combination of the RF algorithm is then optimized iteratively, exploiting the strengths of the ASSABC algorithm in optimization problems. Finally, algorithm performance is compared on the segment0 unbalanced dataset from the KEEL database and on three unbalanced binary classification datasets formed from different categories of the KDD CUP 1999 dataset. The experiments demonstrate the feasibility of the UAW_RF algorithm.

(3) Application case validation. Rock slopes are a particular geological type of landslide found in the northern part of the China-Pakistan highway, and they often interrupt traffic on the highway. The enhanced algorithm was applied to the classification and prediction of rock slope datasets. A landslide area extending 2 km on both sides of the Urumqi-Khunjerab section, in the Gaizi River Valley locality of the China-Pakistan highway, was selected as the research area. ArcGIS software was used to extract 11 attributes, such as DEM, slope, soil type, and rainfall, from the bitmap and remote sensing images of the study area. The improved algorithm was used to analyze and classify these 11 attributes and to predict the rock slope data. Finally, the performance of the two algorithms was compared with that of Logistic Regression, K-Nearest Neighbor, and XGBoost in the study area.

The experimental results show that the UAW_RF model based on the ASSABC algorithm improves classification performance on the slope dataset, and that it classifies the slope dataset in the study area better than the Logistic Regression, K-Nearest Neighbor, and XGBoost algorithms. It can be concluded that this method can provide some decision support for disaster protection of the sliding slopes in the Urumqi-Khunjerab section of the China-Pakistan highway.
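
The weighted-voting idea mentioned in (2) can be illustrated with a minimal sketch. The abstract does not give the weighting formula used by UAW_RF, so the sketch assumes, purely for illustration, that each tree's vote is weighted by its balanced accuracy (the mean of per-class recall) on held-out data. The function names, the held-out labels, and the tree predictions are all hypothetical.

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so each class counts equally regardless of size."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for truth, pred in zip(y_true, y_pred):
        totals[truth] += 1
        if truth == pred:
            hits[truth] += 1
    return sum(hits[c] / totals[c] for c in totals) / len(totals)


def weighted_vote(tree_votes, tree_weights):
    """Return the label whose votes carry the largest summed weight."""
    scores = defaultdict(float)
    for label, weight in zip(tree_votes, tree_weights):
        scores[label] += weight
    return max(scores, key=scores.get)


# Hypothetical held-out labels; class 1 is the minority class.
y_val = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]

# Hypothetical predictions of five trees on the held-out samples.
val_preds = [
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],  # always predicts the majority class
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],  # always predicts the majority class
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],  # always predicts the majority class
    [0, 0, 0, 0, 0, 0, 0, 1, 1, 1],  # finds both minority samples, one false alarm
    [0, 0, 0, 0, 0, 0, 0, 0, 1, 1],  # classifies every sample correctly
]

# Assumed weighting scheme: one balanced-accuracy weight per tree.
weights = [balanced_accuracy(y_val, preds) for preds in val_preds]

# Votes of the same five trees on a new sample.
votes = [0, 0, 0, 1, 1]

plain_result = weighted_vote(votes, [1] * len(votes))  # unweighted majority vote
weighted_result = weighted_vote(votes, weights)        # accuracy-weighted vote
print(plain_result, weighted_result)
```

In this toy setting the three trees that ignore the minority class outvote the other two under plain majority voting, while the weighted vote favors the trees that actually recognize the minority class. Whether UAW_RF uses this particular weight is not stated in the abstract.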
Keywords/Search Tags:Random Forest Algorithm, Artificial Bee Colony Algorithm, Parameter Optimization, Rock Slope Disaster