
Research On The Application Of Machine Learning Algorithm In Commercial Endowment Insurance Purchasing Behavior

Posted on: 2022-06-01; Degree: Master; Type: Thesis
Country: China; Candidate: S Y Guo
GTID: 2518306314460764; Subject: Applied Statistics
Abstract/Summary:
In the era of big data, insurance companies that make effective and reasonable use of existing data to mine more potential information can not only improve work efficiency but also achieve precise marketing to customers, thus saving unnecessary expenses. At the same time, China attaches increasing importance to commercial endowment insurance, so it is of great significance to use customer information to explore the purchasing behavior of commercial endowment insurance. This thesis mainly studies the application of different machine learning classification algorithms to identifying insurance customers in imbalanced data. It also analyzes the factors affecting the purchase of commercial endowment insurance from the perspectives of individuals and families, which provides a reference for insurance marketing. The thesis first introduces the theory of the machine learning algorithms and of the methods for handling imbalanced data, and then selects the 2017 Chinese General Social Survey data as the sample. Whether a respondent purchased commercial endowment insurance is set as the target variable, with purchasers as the positive class and non-purchasers as the negative class. After data preprocessing, the feature variables are screened, and the 10 screened variables are used in the modeling. Since the number of positive samples is much larger than that of negative samples, this is a binary classification problem with imbalanced samples. The data are then divided into a training set and a test set; logistic regression, naive Bayes, decision tree, support vector machine, and ensemble learning models are established using the two data sets, and the classification results are evaluated according to classification evaluation metrics. The logistic regression and naive Bayes models recognize negative samples very poorly. By introducing a misclassification cost matrix, the C5.0 decision tree and the support vector machine achieve high recognition accuracy for negative-class samples, and Bagging ensemble learning also addresses the sample imbalance problem by integrating CART decision trees. Finally, based on the variable importance obtained from the decision tree and ensemble learning and on the classification rules of the decision tree, some suggestions and strategies are provided for insurance marketing.
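As a rough illustration of what a misclassification cost matrix does, the sketch below applies a generic minimum-expected-cost decision rule to predicted class probabilities and compares negative-class recall with and without the costs. It is not the C5.0 or support vector machine procedure used in the thesis. The cost values, the probabilities, the labels, and the helper names `min_cost_class` and `class_recall` are all hypothetical, chosen only to show the effect.

```python
# Illustrative sketch only: the cost values, probabilities, and labels below
# are hypothetical and are not taken from the thesis.

# COST[true][pred] is the cost of predicting `pred` when the true class is
# `true`. Missing a negative-class sample is made 5x more expensive here.
COST = {
    "neg": {"neg": 0.0, "pos": 5.0},
    "pos": {"neg": 1.0, "pos": 0.0},
}


def min_cost_class(p_pos, cost=COST):
    """Return the class with the lowest expected misclassification cost."""
    p = {"pos": p_pos, "neg": 1.0 - p_pos}
    expected = {
        pred: sum(p[true] * cost[true][pred] for true in p)
        for pred in ("neg", "pos")
    }
    return min(expected, key=expected.get)


def class_recall(y_true, y_pred, label):
    """Fraction of samples whose true class is `label` that were predicted as `label`."""
    hits = sum(1 for t, y in zip(y_true, y_pred) if t == label and y == label)
    total = sum(1 for t in y_true if t == label)
    return hits / total if total else 0.0


# Hypothetical predicted probabilities of the positive class from a model
# that leans heavily toward "pos".
probs = [0.9, 0.8, 0.7, 0.6, 0.95, 0.75]
y_true = ["pos", "pos", "neg", "neg", "pos", "neg"]

# Plain 0.5 threshold versus the cost-sensitive decision rule.
plain = ["pos" if p >= 0.5 else "neg" for p in probs]
cost_sensitive = [min_cost_class(p) for p in probs]

print("negative recall, plain:", class_recall(y_true, plain, "neg"))
print("negative recall, cost-sensitive:", class_recall(y_true, cost_sensitive, "neg"))
```

With these assumed costs, a sample is labeled positive only when its predicted positive probability exceeds 5/6. More negative-class samples are therefore recovered, at the price of some positive-class recall.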
Keywords/Search Tags:Misclassification Cost, Decision Tree, Ensemble Learning, Data Mining, Customer Identification