Improved Random Forest Algorithm Based On Pruning And Its Application To The Classification Of Remote Sensing Image

Posted on: 2020-07-02
Degree: Master
Type: Thesis
Country: China
Candidate: X M Zhang
GTID: 2480306452469304
Subject: Cartography and Geographic Information System
Abstract/Summary:
With the development of remote sensing technology, the quantity, type and complexity of remote sensing data are growing rapidly, and remote sensing big data is gaining momentum. Remote sensing image classification is the key to remote sensing information extraction. Because of its high classification accuracy, resistance to over-fitting, and strong parallel processing ability, the Random Forest classification algorithm is widely used in remote sensing image classification. Remote sensing data is complex, which makes it necessary to train enough decision trees to achieve higher accuracy. However, increasing the number of decision trees requires more computing resources and reduces classification efficiency. Studies have shown that ensemble pruning, that is, selecting a subset of the base classifiers, can reduce ensemble size and improve image classification efficiency without affecting accuracy. To improve the accuracy and speed of the Random Forest algorithm in remote sensing image classification, this thesis carried out the following research:

Existing ensemble pruning algorithms consider only behavioral diversity and ignore structural diversity. Based on decision tree structure, this thesis proposes a new diversity measure called Decision Tree Similarity Measurement (DTSM). Combining tree matching from graph theory, sequence similarity from pattern recognition, and substructure mining, DTSM converts decision tree similarity into the similarity of matched sequences. Using node information, the Simple Tree Matching algorithm first obtains the maximum matching subtrees. Then, using path information, the similarity of the decision trees is transformed into the similarity of the matching sequences. Finally, similarity is measured by comparing the sequence substructures.

Combining diversity and accuracy, the Pruned Random Forest (PRF) algorithm is proposed. The main idea is to compute the similarity between each pair of decision trees using DTSM, and then to select a subset of the decision trees based on a similarity threshold and the out-of-bag error. The experiments indicate that the PRF algorithm can eliminate more than 40% of the decision trees without reducing classification accuracy.

To obtain the highest classification accuracy while reducing ensemble size, Pruned Random Forest based on Improved Particle Swarm Optimization (IPSO-PRF) is developed. The IPSO algorithm introduces a linear inertia weight and a neighborhood factor, and is used to optimize the number of random features and the similarity threshold. The Pruned Random Forest is then trained with the optimal combination of parameters. Comparative experiments further indicate that the IPSO algorithm improves local search ability and convergence speed, and that the IPSO-PRF algorithm achieves the highest classification accuracy with the optimal parameters.

The land cover classification of Xiamen is taken as an example to verify the effect of the PRF and IPSO-PRF algorithms in remote sensing image classification. After spectral, topographic and texture features are extracted from Landsat 8 and Digital Elevation Model data as classification features, the IPSO-PRF algorithm is used to perform pixel-based land cover classification. The results show that the IPSO-PRF algorithm not only improves the overall classification accuracy but also greatly reduces the number of decision trees in the ensemble model.
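As a rough illustration of pruning by a similarity threshold and out-of-bag error, the following Python sketch shows one possible selection rule. The abstract does not give the exact procedure, so everything here is an assumption: the function name `prune_by_similarity`, the greedy ordering by out-of-bag error, and the toy numbers are all hypothetical. The pairwise similarity matrix stands in for DTSM scores, which are not computed here.

```python
from typing import List, Sequence


def prune_by_similarity(
    similarity: Sequence[Sequence[float]],
    oob_error: Sequence[float],
    threshold: float,
) -> List[int]:
    """Return indices of the trees kept after pruning.

    Hypothetical greedy rule (an assumption, not the thesis's procedure):
    visit trees from lowest to highest out-of-bag error and keep a tree
    only if its similarity to every tree already kept is below `threshold`.
    """
    order = sorted(range(len(oob_error)), key=lambda i: oob_error[i])
    kept: List[int] = []
    for i in order:
        if all(similarity[i][j] < threshold for j in kept):
            kept.append(i)
    return sorted(kept)


# Toy values, made up for illustration only (not DTSM output).
similarity = [
    [1.0, 0.9, 0.2, 0.3],
    [0.9, 1.0, 0.1, 0.4],
    [0.2, 0.1, 1.0, 0.8],
    [0.3, 0.4, 0.8, 1.0],
]
oob_error = [0.10, 0.12, 0.15, 0.11]

kept = prune_by_similarity(similarity, oob_error, threshold=0.7)
print(kept)  # [0, 3]
```

In this toy run, tree 1 is dropped because it is too similar to the more accurate tree 0, and tree 2 is dropped because it is too similar to tree 3. A lower threshold prunes more aggressively, which is why the abstract treats the threshold as a parameter worth optimizing.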
Keywords/Search Tags: Remote Sensing Image Classification, Pruned Random Forest, Decision Tree Similarity, Particle Swarm Algorithm, Land Cover Classification