Application Of Data Mining In The Purchase Behavior Of Residents' Medical Insurance

Posted on: 2020-03-22
Degree: Master
Type: Thesis
Country: China
Candidate: S S Wang
GTID: 2404330620957272
Subject: Applied Statistics
Abstract/Summary:
In recent years, big data has been applied in more and more fields, but it has seen comparatively little use in insurance. Insurance companies want to find their most valuable target customers with the fewest resources, and data mining can extract useful latent information from large volumes of data and help transform the traditional marketing model. Against this background, the paper studies the application of data mining algorithms to identifying insurance company customers in imbalanced data, and analyzes the factors that affect Chinese residents' purchase of commercial medical insurance, so as to provide a reference for insurance companies in product marketing.

The analysis is based on data from the 2015 Chinese General Social Survey (CGSS). Fourteen variables are selected as input variables according to the relevant literature, and whether a resident buys commercial medical insurance is the output variable. Residents who buy commercial medical insurance are treated as positive samples, and those who do not are treated as negative samples. The number of positive samples is far smaller than the number of negative samples.

First, a traditional decision tree classification model is used for the empirical analysis, but its classification performance is poor and it is essentially unable to identify the positive samples. Second, a decision tree model with misclassification cost is used. Including the misclassification cost significantly improves the model's ability to recognize positive samples, but that ability still falls well short of its ability to recognize negative samples. Third, a cost-sensitive decision tree model is used, that is, a decision tree model that accounts for both misclassification cost and test cost. The results show that the model's recognition accuracy for positive samples improves to roughly the same level as for negative samples. Finally, a weighted random forest (WRF) model is used, and it also identifies the small number of positive samples with high accuracy.

The empirical results show that the cost-sensitive decision tree model and the weighted random forest model handle the sample imbalance problem well, accurately predict Chinese residents' purchase of commercial medical insurance, and effectively identify high-value customers. The paper identifies the characteristics that have the greatest impact on whether Chinese residents purchase commercial medical insurance, and offers advice to insurance companies on improving performance by accurately identifying high-value customers.
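
As a rough illustration of the misclassification-cost idea described in the abstract, the sketch below shows how a single decision tree leaf could be labeled so as to minimize total misclassification cost rather than by majority vote. It is not the thesis's implementation: the function name `leaf_label`, the sample counts, and the cost values are all hypothetical, and test cost is left out entirely.

```python
def leaf_label(n_pos, n_neg, cost_fn=1.0, cost_fp=1.0):
    """Label a leaf 1 (buyer) or 0 (non-buyer) by minimizing misclassification cost.

    n_pos, n_neg: counts of positive and negative training samples in the leaf.
    cost_fn: assumed cost of missing a buyer (false negative).
    cost_fp: assumed cost of flagging a non-buyer (false positive).
    """
    # Labeling the leaf negative misclassifies every positive sample in it.
    cost_if_negative = n_pos * cost_fn
    # Labeling the leaf positive misclassifies every negative sample in it.
    cost_if_positive = n_neg * cost_fp
    return 1 if cost_if_positive < cost_if_negative else 0


# Hypothetical leaf holding 20 buyers and 80 non-buyers.
plain = leaf_label(20, 80)                               # equal costs: majority vote
weighted = leaf_label(20, 80, cost_fn=5.0, cost_fp=1.0)  # missing a buyer costs 5x
print(plain, weighted)
```

With equal costs the leaf follows the majority class, so the minority buyers in it are never identified. Raising the assumed false-negative cost flips the label once the cost of missing the buyers outweighs the cost of the false alarms, which is the mechanism that lets a cost-aware tree recover positive samples in imbalanced data.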
Keywords/Search Tags:Cost sensitive, decision tree, weighted random forest, data mining, customer identification