| Lithology classification is an important application of remote sensing in geology. Traditional remote sensing lithology classification draws on a single type of data and relies mostly on visual interpretation, which results in low efficiency and poor classification accuracy. In recent years, new sensors have provided massive amounts of data for remote sensing lithology classification, and machine learning algorithms have been widely used for remote sensing image classification. Combining machine learning algorithms with multi-source remote sensing data for automated, high-precision lithology classification can provide technical support for geological surveys and for updating geological data. This paper takes Duolun County, Inner Mongolia Autonomous Region, as the study area and combines optical, radar, and terrain data to carry out machine learning lithology classification. The specific research work is as follows: (1) Spectral, amplitude, and backscatter features were obtained from the pre-processed Landsat 8 OLI, GF-2, and GF-3 images. Texture features were extracted from the GF-2 and GF-3 data through principal component analysis and the gray-level co-occurrence matrix (GLCM). Cloude decomposition and Freeman decomposition were used to obtain polarimetric decomposition features from the GF-3 data. Terrain features were derived from Digital Elevation Model (DEM) data by raster calculation, giving 63 features in total. Recursive feature elimination combined with random forest was used to rank the importance of the 63 features, and 22 preferred features were selected. The results show that, for lithology classification in the study area, the near-infrared band and the two shortwave-infrared bands of the Landsat 8 OLI image have the highest importance scores (0.0511, 0.0466, and 0.0441, respectively), and texture features are the most numerous group, 10 in total, accounting for 45% of the preferred
features. (2) Five feature combination schemes were designed according to data type and the feature selection results. Scheme 1 includes optical and terrain features, Scheme 2 includes Synthetic Aperture Radar (SAR) and terrain features, Scheme 3 includes all features, Scheme 4 includes optical and SAR features, and Scheme 5 includes all the preferred features. Four machine learning models, K-Nearest Neighbor (KNN), Adaptive Boosting (AdaBoost), Support Vector Machine (SVM), and Random Forest (RF), were compared on lithology classification with the different feature combinations. The results show that SVM performed best among the four models. SVM, KNN, and AdaBoost achieved their highest classification accuracy with Scheme 5, with overall accuracies of 84.88%, 81.22%, and 75.51% and Kappa coefficients of 0.8270, 0.7856, and 0.7206, respectively. The RF model achieved its highest classification accuracy with Scheme 3, with an overall accuracy of 83.09% and a Kappa coefficient of 0.8063. The preferred feature combination achieves higher classification accuracy while greatly reducing the number of features. Adding SAR and terrain features also helps to improve lithology classification accuracy. (3) A Genetic Algorithm (GA) was used to optimize the hyperparameters of the SVM and RF models. The classification accuracy of the optimized SVM model improved under all five feature combination schemes; Scheme 1 showed the largest improvement and Scheme 2 the smallest. From Scheme 1 to Scheme 5, the overall accuracy improved by 0.63%, 0.11%, 0.44%, 0.52%, and 0.36%, respectively, and the Kappa coefficient improved by 0.0072, 0.0011, 0.0051, 0.006, and 0.0041, respectively. The accuracy improvement of the optimized RF model was not significant. From Scheme 1 to Scheme 4, the overall accuracy increased by 0.13%, 0.14%, 0.13%, and 0.04%, respectively, and the Kappa
coefficient increased by 0.0014, 0.0014, 0.0015, and 0.0004, respectively. For Scheme 5, the overall accuracy and Kappa coefficient decreased by 0.05% and 0.0006, respectively. Compared with the RF model, the SVM model benefits more clearly from genetic algorithm optimization in terms of lithology classification accuracy. |
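
The two metrics reported throughout, overall accuracy and the Kappa coefficient, can both be computed from a classification confusion matrix. The following is a minimal sketch of that calculation, not the authors' code: the function name `overall_accuracy_and_kappa` and the 3-class matrix are hypothetical and chosen only for illustration, and the values are not taken from the study.

```python
from typing import List, Tuple


def overall_accuracy_and_kappa(confusion: List[List[int]]) -> Tuple[float, float]:
    """Return (overall accuracy, Cohen's kappa) for a square confusion matrix.

    Rows are reference (ground-truth) classes, columns are predicted classes.
    """
    n_classes = len(confusion)
    if any(len(row) != n_classes for row in confusion):
        raise ValueError("confusion matrix must be square")

    total = sum(sum(row) for row in confusion)
    if total == 0:
        raise ValueError("confusion matrix is empty")

    # Observed agreement: share of samples on the diagonal (overall accuracy).
    p_o = sum(confusion[i][i] for i in range(n_classes)) / total

    # Chance agreement: sum over classes of row marginal times column marginal.
    row_totals = [sum(row) for row in confusion]
    col_totals = [
        sum(confusion[i][j] for i in range(n_classes)) for j in range(n_classes)
    ]
    p_e = sum(r * c for r, c in zip(row_totals, col_totals)) / (total * total)
    if p_e == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")

    kappa = (p_o - p_e) / (1.0 - p_e)
    return p_o, kappa


# Hypothetical 3-class confusion matrix (illustrative values only).
EXAMPLE = [
    [40, 5, 5],
    [4, 42, 4],
    [6, 3, 41],
]

oa, kappa = overall_accuracy_and_kappa(EXAMPLE)
print(f"OA = {oa:.4f}, Kappa = {kappa:.4f}")  # OA = 0.8200, Kappa = 0.7300
```

Because Kappa subtracts the agreement expected by chance, it is always lower than the overall accuracy for an imperfect classifier, which matches the pattern in the reported results (for example, 84.88% against 0.8270).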