Research On Fraud Identification Of Vehicle Insurance Based On Machine Learning

Posted on:2022-05-12

Degree:Master

Type:Thesis

Country:China

Candidate:X Y Chu

GTID:2518306311468894

Subject:Applied Statistics

Abstract/Summary:

Motor vehicle insurance has long been the largest line of property insurance business in China. However, statistics show that about 20% of auto insurance claims may involve fraud, while fewer than 3% of suspected fraud cases are prosecuted. With auto insurance reform reducing premium income, insurance companies can keep their operations sound and stay competitive by reducing insurance expenditure. Improving the identification rate of insurance fraud and cracking down on it accurately is therefore of great significance for reducing the cost of insurance premiums. Compared with manual identification, identifying insurance fraud with machine learning models costs less and is more accurate. This paper introduces single statistical models and machine learning models into the identification of auto insurance fraud, and uses Stacking to integrate multiple models into a more stable model with higher prediction accuracy. First, the paper summarizes the basic concepts of machine learning and the basic theory of the models it uses. Next, data preprocessing methods such as data cleaning, data transformation and feature selection are applied to a public Kaggle data set, which reduces running time and improves prediction accuracy when the fraud identification models are subsequently built. Then, taking the preprocessed data as input, insurance fraud identification models based on Naive Bayes, SVM, Adaboost and KNN are constructed, and their parameters are tuned to obtain better performance. The prediction results are compared and analyzed using evaluation methods for classification models: KNN has the best classification performance on the sample as a whole, while Naive Bayes has the worst overall prediction ability but the best prediction for fraud samples among the four models. Finally, the paper introduces the basic workflow of Stacking and uses it to fuse the four models into a new auto insurance fraud identification model. This model combines the high stability of statistical models with the high prediction accuracy of machine learning models, and its prediction accuracy for both the whole sample and the fraud samples is significantly improved compared with that of any single model.
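
The workflow described in the abstract (four base classifiers fused by Stacking, then compared on overall accuracy and on the fraud class) might look roughly like the following scikit-learn sketch. It is not the thesis's implementation: the data here is a synthetic, imbalanced stand-in generated with make_classification rather than the Kaggle data set, the logistic-regression meta-learner and all hyperparameters are assumptions, and recall on the fraud class is only one possible way to measure "prediction for fraud samples".

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# ASSUMPTION: synthetic stand-in for the preprocessed claims data
# (label 1 = fraud, roughly 20% of samples).
X, y = make_classification(
    n_samples=1000, n_features=12, n_informative=6,
    weights=[0.8, 0.2], random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# The four base models named in the abstract; hyperparameters are
# illustrative defaults, not the tuned values from the thesis.
base_models = [
    ("nb", GaussianNB()),
    ("svm", make_pipeline(StandardScaler(), SVC(random_state=0))),
    ("ada", AdaBoostClassifier(n_estimators=50, random_state=0)),
    ("knn", make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5))),
]

# Stacking: the meta-learner is trained on out-of-fold predictions
# of the base models (5-fold here).
# ASSUMPTION: logistic regression as the meta-learner.
stack = StackingClassifier(
    estimators=base_models,
    final_estimator=LogisticRegression(max_iter=1000),
    cv=5,
)

# Fit every single model and the stacked model, then record overall
# accuracy and recall on the fraud class for each.
results = {}
for name, model in base_models + [("stacking", stack)]:
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    results[name] = {
        "accuracy": accuracy_score(y_test, pred),
        "fraud_recall": recall_score(y_test, pred, pos_label=1),
    }

for name, scores in results.items():
    print(f"{name:9s} accuracy={scores['accuracy']:.3f} "
          f"fraud_recall={scores['fraud_recall']:.3f}")
```

Reporting fraud-class recall next to overall accuracy matters because the classes are imbalanced: a model can rank first on accuracy while missing more fraud cases than a model that ranks last, which is the pattern the abstract reports for KNN and Naive Bayes. Scores on this synthetic data say nothing about the thesis's actual results.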

Keywords/Search Tags:

Machine learning, Adaboost, Stacking, Feature selection

Related items

1	Research On Optimization Algorithms Of Stacking Classifiers
2	Stock Price Prediction Research Based On Feature Selection And Improved Stacking Algorithm
3	Credit Default Detection Based On Deep Heterogeneous Stacking Model
4	Prediction Of Students' Academic Level Based On Feature Selection And Stacking Framework
5	Research On The Risk Control Model Based On Machine Learning Algorithms
6	Research On IPTV Users’ Complaint Prediction Algorithm Based On Machine Learning
7	Stacking Ensemble Learning Algorithm Based On Intrusion Detection
8	Research And Application Of Integrated Feature Selection Algorithm Based On Extreme Learning Machine
9	Research On Feature Selection For Machine Learning
10	Research On E-commerce Purchase Behavior Prediction Based On Feature Selection And Stacking Integrated Algorithm