Prediction Of UBI Auto Insurance Indemnity Rate Based On Machine Learning Algorithm

Posted on: 2020-10-23
Degree: Master
Type: Thesis
Country: China
Candidate: Y G Wang
GTID: 2428330590993114
Subject: Insurance
Abstract/Summary:
Since 2012, China's car ownership has increased steadily. The gradually liberalized rate reform policy and the rapid development of Internet technology have prompted many non-life insurance companies whose main business pillar is auto insurance to begin transforming into digital, technology-driven insurers. With the development of the Internet, the Internet of Vehicles, and artificial intelligence, auto insurance business data has expanded rapidly. It is therefore important to use existing resources and tools to extract useful information from massive data, accurately identify valuable customers, improve the competitiveness of auto insurance products, and raise the level of risk management and control in the auto insurance business.

Based on the real returned data of a non-life insurance company's UBI auto insurance business, this paper uses two machine learning algorithms, GBDT and LightGBM, to predict the loss ratio of UBI auto insurance users, and compares them with the conventional generalized linear model (GLM) in terms of prediction accuracy, model complexity, and other aspects. The results show that the machine learning models predict the loss ratio more accurately. They also place no requirement on the prior distribution of the dependent and independent variables, process high-dimensional insurance data efficiently, and retain all independent-variable dimensions. Their disadvantages are a high proportion of negative predicted values with a wide fluctuation range, a lack of interpretability of how the independent variables affect the dependent variable, and high demands on the user's mathematical and programming ability. The machine learning models are found to be complementary to the traditional generalized linear model. The machine learning algorithm is better suited to scenarios with a
large amount of basic data, where high prediction accuracy is required and the interpretability of the relationship between independent and dependent variables matters less. The traditional generalized linear model focuses more on reflecting how the independent variables act on the dependent variable. In practice, different models should be selected for different business scenarios. Combining machine learning models with the traditional generalized linear model can improve the accuracy of data analysis results and enhance the overall competitiveness of enterprises.

Existing research on machine learning algorithms in auto insurance mostly uses traditional auto insurance business data. The research problems mostly concern claim frequency, claim severity, the number of claims, the prediction of cumulative claims, and the identification of insurance fraud. The algorithms are mostly used for classification tasks, and regression methods are rarely used to solve prediction problems. Studying regression-based machine learning algorithms for UBI auto insurance loss ratio prediction therefore has both academic and practical significance. Academically, it first helps enrich the application of machine learning algorithms to the prediction of auto insurance business indicators on large volumes of data, and second offers useful insight for extending machine learning algorithms to other non-life insurance companies. In practice, it helps insurance companies improve their ability to use machine learning models and broaden the range of business to which they apply them. It also helps improve the risk management capability and product competitiveness of insurance companies' auto insurance business and refine their digital transformation strategies. At the same time, it will promote the supervision
department to formulate data sampling norms and standards for the auto insurance business, advance the construction and sharing of insurance data, and give full play to the risk management and protection functions of insurance.

The structure of this paper is as follows: the first chapter is the introduction; the second chapter covers machine learning algorithm theory; the third chapter is the descriptive statistical analysis of the original data; the fourth chapter is the prediction of the UBI auto insurance loss ratio based on different algorithms; the fifth chapter presents conclusions and suggestions.

The first chapter introduces the current development and research status of Internet of Vehicles insurance and briefly describes the main applications of machine learning algorithms, in China and abroad, to the analysis of auto insurance business. Against the background of China's commercial auto insurance rate reform and the rapid development of insurance technology, it argues that using machine learning algorithms to predict the UBI auto insurance loss ratio is both feasible and necessary. Based on a review of the relevant domestic and foreign literature, it then presents the research methods, innovations, and limitations of this paper.

The second chapter introduces the basic theory behind the machine learning models used in this paper from the perspectives of basic ideas, basic models, and basic methods. It covers the idea of regression, the decision tree model, directional derivatives and gradient calculation, the gradient descent method, and so on. It then explains the theory of the GLM, GBDT, and LightGBM models used in this paper, making the black box of the machine learning algorithms partially transparent. Finally, the three models are compared in terms of the models themselves.

The third chapter first introduces the source and structure of the original data. Then the basic characteristics of each dimension of the original data are
analyzed. Finally, the provinces in which the original data were collected are analyzed, and abnormal data are deleted and pre-processed.

In the fourth chapter, the UBI auto insurance loss ratio is predicted with the different models, and the three models are compared in several respects, with prediction accuracy as the basis. The work is divided into the following steps.

In the first step, the original minute-level sampling data are preprocessed with the user as the first reference and the trip as the second reference. While the amount of data is reduced, the original 10 data dimensions are extended to 48, which refines the influence of each dimension of the original data. On this basis, the distribution of the dependent variable, the UBI auto insurance loss ratio, is analyzed a priori, and candidate distributions are fitted and examined. The training set and the test set are divided by random sampling and account for 20% and 80%, respectively, of the preprocessed data.

In the second step, following the theory introduced in the second chapter, the GLM, GBDT, and LightGBM models are each established to predict the UBI auto insurance loss ratio. In model training, R² is used as the training evaluation index for the machine learning models. Using grid search, preliminary parameters are first obtained by training on the training set, and the prediction set is then used to control over-fitting and determine the final model parameters. For the traditional generalized linear model, the AIC value is used as the training evaluation index. Dimensions are deleted according to the significance of each independent variable's effect on the dependent variable to obtain the optimal model. In the prediction part, the true and predicted values of some samples are randomly selected, and the prediction accuracy is
analyzed by plotting the true values on the X-axis against the predicted values on the Y-axis. The MAE, MSE, and RMSE indicators are used to evaluate the error between the predicted and true values of each model, and the negative rate of the predicted values is calculated.

In the third step, with prediction accuracy as the primary consideration, the three models are compared in terms of the negative rate of the predicted values, model efficiency, the difficulty of parameter tuning, model complexity, and the retention of the original data dimensions.

The fourth step discusses extending the approach in terms of the applicability of hardware-returned data, the prediction of the UBI auto insurance loss ratio, the prediction of the traditional auto insurance loss ratio, other auto insurance business indicators, and other non-auto insurance business.

In the fifth chapter, the research of this paper is first summarized. Then, in light of the actual situation in China, tentative suggestions are made from the perspectives of the auto insurance business, insurance companies, and policy supervision for improving insurers' ability to apply machine learning algorithms and for speeding up the digital transformation of insurance in China.
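
The error indicators named in the second step (MAE, MSE, RMSE, and the negative rate of the predicted values) can be illustrated with a minimal Python sketch. The function name `regression_errors` and the sample numbers are hypothetical and are not taken from the thesis. The sketch also assumes that "negative rate" means the share of predictions below zero, which are not valid loss ratios because a loss ratio cannot be negative.

```python
import math


def regression_errors(y_true, y_pred):
    """Return MAE, MSE, RMSE and the share of negative predictions.

    Hypothetical helper for illustration; not the thesis's own code.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    # Residual = predicted value minus true value, one per sample.
    residuals = [p - t for t, p in zip(y_true, y_pred)]
    mae = sum(abs(r) for r in residuals) / n
    mse = sum(r * r for r in residuals) / n
    rmse = math.sqrt(mse)
    # Assumed definition: fraction of predictions that fall below zero.
    negative_rate = sum(1 for p in y_pred if p < 0) / n
    return {"mae": mae, "mse": mse, "rmse": rmse, "negative_rate": negative_rate}


# Made-up loss ratios for illustration only (not data from the thesis).
y_true = [0.0, 0.5, 1.0, 2.0]
y_pred = [-0.5, 0.5, 1.5, 1.0]
metrics = regression_errors(y_true, y_pred)
print(metrics)
```

Because MSE squares each residual, the single large miss in the sample dominates it, while MAE weights all misses equally. Reporting both alongside the negative rate makes it easier to see whether a model's errors come from a few extreme predictions.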
Keywords/Search Tags: Machine learning algorithm, UBI car insurance, loss ratio of auto insurance