
Analysis Of Factors Influencing The Value Retention Rate Of Second-hand Cars

Posted on: 2020-03-27; Degree: Master; Type: Thesis
Country: China; Candidate: X Y Cao; Full Text: PDF
GTID: 2432330578978878; Subject: Applied Statistics
Abstract/Summary:
With the rapid development of its economy, China has entered a wave of "Internet Plus" and "mass innovation" development, which has brought many opportunities to the second-hand car industry. This thesis performs a descriptive statistical analysis of 6,425 records collected from the Guazi second-hand car direct-sales website and studies the factors influencing the value retention rate of second-hand cars using a statistical model and algorithmic models. The aim is to give users a method for evaluating the value retention rate of a second-hand car and to help them make better decisions. The main idea is to build four models (logistic regression, decision tree, random forest, and XGBoost), compute each model's prediction accuracy from its confusion matrix, select the most accurate model as the optimal one, and then use the optimal model to identify the variables that have an important influence on second-hand car value retention.

The contributions of this study are as follows:

1. Data collection and preparation. Information on 6,425 second-hand cars was collected with the Octoparse (Bazhuayu) web scraper, covering original price and asking price, usage condition, basic attributes, power performance, interior and exterior configuration, fault diagnosis, and other aspects. The data were cleaned and features were constructed to suit the models chosen in this thesis.

2. Descriptive statistical analysis. The second-hand car data are described from six aspects: value retention rate, usage condition, basic attributes, power performance, interior and exterior configuration, and fault diagnosis. This analysis shows how these indicators are distributed and gives a preliminary view of the relationship between each indicator and the value retention rate.

3. Logistic regression model of the value retention rate. The value retention rate is first defined. Cars are then ranked by value retention rate from high to low; the top 30% are labeled as high value retention and the rest as low. Stepwise logistic regression with the BIC criterion is used to select variables among the influencing factors. A regression on all variables is then fitted and its confusion matrix is plotted; the high accuracy indicates that the selected model is suitable.

4. Classification algorithm models of the value retention rate. First, a decision tree model is built with the CART algorithm. Then, building on the decision tree, a random forest model and an XGBoost model are established. Finally, a confusion matrix is produced for each of the three models to judge whether the model is suitable, and the variable importance ranking under each model is obtained.

The results show that all four models can be used to study the factors influencing second-hand car value retention. Among them, the random forest model has the highest prediction accuracy and the best performance, so it is selected as the optimal model. According to this model, whether a second-hand car's value retention rate is high or low is related to registration time, mileage, horsepower, wheelbase, manufacturer, and other factors.
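The labeling and evaluation steps described in the abstract can be illustrated with a minimal sketch. It is not the thesis's code: the abstract does not give the formula for the value retention rate, so the ratio of asking price to original price used here is an assumption, and the function names, the toy prices, and the `predicted` labels are all hypothetical. Only the 30% cutoff and the use of a confusion matrix to obtain accuracy come from the abstract.

```python
def retention_rates(original_prices, asking_prices):
    """Assumed definition: asking price divided by original price."""
    return [ask / orig for orig, ask in zip(original_prices, asking_prices)]


def label_top_fraction(rates, fraction=0.30):
    """Label the highest `fraction` of rates as 1 (high), the rest as 0 (low)."""
    n_high = int(round(len(rates) * fraction))
    # Indices of the cars, ordered from highest to lowest retention rate.
    order = sorted(range(len(rates)), key=lambda i: rates[i], reverse=True)
    labels = [0] * len(rates)
    for i in order[:n_high]:
        labels[i] = 1
    return labels


def confusion_matrix(actual, predicted):
    """Return (tn, fp, fn, tp) for binary labels coded as 0 and 1."""
    tn = fp = fn = tp = 0
    for a, p in zip(actual, predicted):
        if a == 1 and p == 1:
            tp += 1
        elif a == 1 and p == 0:
            fn += 1
        elif a == 0 and p == 1:
            fp += 1
        else:
            tn += 1
    return tn, fp, fn, tp


def accuracy(actual, predicted):
    """Share of cars whose predicted label matches the actual label."""
    tn, fp, fn, tp = confusion_matrix(actual, predicted)
    return (tn + tp) / (tn + fp + fn + tp)


# Toy data (hypothetical): ten cars, prices in arbitrary units.
original = [10.0] * 10
asking = [9.0, 5.0, 8.0, 3.0, 7.0, 4.0, 6.0, 2.0, 1.0, 5.5]

rates = retention_rates(original, asking)
actual = label_top_fraction(rates)           # top 30% -> high retention
predicted = [1, 0, 1, 0, 0, 0, 1, 0, 0, 0]   # output of some hypothetical model

print(confusion_matrix(actual, predicted))
print(accuracy(actual, predicted))
```

In the workflow the abstract describes, the same accuracy figure would be computed for each of the four models, and the model with the highest value would be kept as the optimal one.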
Keywords/Search Tags: second-hand car value retention rate, logistic regression, decision tree, random forest, XGBoost