| In recent years, with the rapid development of the domestic market for second-hand electric vehicles, the number of second-hand electric vehicles has been increasing. For various reasons, more and more customers choose to use second-hand electric vehicles on a daily basis, leading to growth in transaction volumes and expansion of the market. At the same time, some problems in the second-hand electric vehicle industry have been exposed. For example, the industry lacks a uniform method for estimating the preservation rate of second-hand electric vehicles. This has been one of the main factors hindering the development of the industry. At present, the preservation rates of second-hand electric vehicles are usually provided by sellers according to their experience. Such a method of estimating preservation rates is too subjective. To promote healthy and steady development of the second-hand electric vehicle industry, it is necessary to develop an objective and precise method for estimating the preservation rates of second-hand electric vehicles. First, this thesis summarizes the results of the existing literature on second-hand fossil-fuelled vehicles and second-hand electric vehicles and determines twelve potential variables, including brand, mileage, years of driving and so on. Second, by employing variable screening methods including a traditional feature selection method, Lasso regression and the Boruta algorithm, this thesis identifies ten factors, including brand, series and so on, as significant variables. Third, this thesis applies the random forest algorithm and the GBDT algorithm, respectively, to establish prediction models relating the second-hand electric vehicle preservation rate to the ten significant variables. To compare the two models obtained through the random forest algorithm and the GBDT algorithm, the ten-fold
cross-validation method is used. The comparison results show that the predictions provided by the random forest algorithm are more precise than those provided by the GBDT algorithm. To further examine the performance of the random forest algorithm, the thesis examines whether the built random forest model is overfitted and whether it is robust. The results show that the built random forest model is not overfitted and is robust. In summary, the built random forest model is applicable to estimating the preservation rate of second-hand electric vehicles and has considerable practical value. |
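
The ten-fold cross-validation comparison described in the abstract can be illustrated with a minimal, standard-library-only sketch. Everything here is an assumption made for illustration: the data are synthetic, the helper names (`k_fold_splits`, `cross_val_mae`, `fit_mean`, `fit_line`) are hypothetical, mean absolute error is chosen as the error measure, and the two models are simple stand-ins (a mean baseline and a one-variable least-squares line), not the random forest or GBDT models built in the thesis. Only the comparison procedure is the point.

```python
import random
from statistics import mean


def k_fold_splits(n_samples, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k nearly equal folds
    for i in range(k):
        test_idx = folds[i]
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx


def fit_mean(xs, ys):
    """Stand-in model A (assumption): always predict the training mean."""
    m = mean(ys)
    return lambda x: m


def fit_line(xs, ys):
    """Stand-in model B (assumption): one-variable least-squares line."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx else 0.0
    intercept = my - slope * mx
    return lambda x: slope * x + intercept


def cross_val_mae(fit, xs, ys, k=10, seed=0):
    """Mean absolute error averaged over the k held-out folds."""
    fold_errors = []
    for train_idx, test_idx in k_fold_splits(len(xs), k, seed):
        model = fit([xs[i] for i in train_idx], [ys[i] for i in train_idx])
        fold_errors.append(mean(abs(model(xs[i]) - ys[i]) for i in test_idx))
    return mean(fold_errors)


# Synthetic, made-up data: mileage (in 10,000 km) vs. preservation rate.
rng = random.Random(42)
mileage = [rng.uniform(0.5, 15.0) for _ in range(200)]
rate = [0.85 - 0.03 * m + rng.gauss(0, 0.02) for m in mileage]

# Both models are scored on the same folds (same seed), so the
# comparison is like-for-like.
mae_mean = cross_val_mae(fit_mean, mileage, rate)
mae_line = cross_val_mae(fit_line, mileage, rate)
print(f"mean baseline MAE: {mae_mean:.4f}")
print(f"linear fit MAE:    {mae_line:.4f}")
```

Scoring both candidates on identical folds keeps the comparison fair. In practice the two `fit_*` functions would be replaced by the actual random forest and GBDT training routines, and the single synthetic feature by the ten selected variables.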