Research On User Analysis And Behavior Prediction Driven By Big Data

Posted on: 2021-10-24
Degree: Master
Type: Thesis
Country: China
Candidate: Y F Deng
GTID: 2518306308968809
Subject: Information and Communication Engineering
Abstract/Summary:
In customer relationship management, identifying potential churn customers in advance and taking personalized countermeasures has become important for the sustainable development of enterprises. Researchers have actively explored churn prediction solutions supported by data mining technology and have achieved remarkable results. However, in areas such as churn prediction, fraud identification, disease detection, and accident monitoring, imbalanced data is common and may prevent most algorithms from learning correctly. Existing work focuses mainly on improving the precision of classification models and ignores the role of customer segmentation. Even where customer segmentation is taken into account, only customer transaction data is used for quantitative research, so the segmentation problem is not adequately explained and the result is of little use for subsequent churn prediction. This thesis therefore studies customer segmentation and churn prediction jointly, and introduces two supporting systems that run through the whole research process: "data mining technology" and "scientific evaluation indicators". The main tasks are as follows:

(1) Data processing technology, customer segmentation methods, and churn prediction methods are studied. The analysis covers data management, data integration, data cleaning, distance measurement, data transformation, data segmentation, imbalanced data processing, feature extraction, feature selection, and machine learning algorithms.

(2) The problem of bank customer segmentation is addressed, advancing the application, refinement, and verification of a dual clustering algorithm with adaptive weights. First, after preprocessing with methods such as One-Hot encoding and Normalizer, PCA is used to cluster and merge attribute variables, automatically adjusting the feature projection coefficients; this addresses problems such as information loss, multicollinearity, and data redundancy that may exist in prior research. Then K-Means++ is used to subdivide the customer categories, with Inertia and Silhouette used to determine the optimal number of clusters. The model performs well and addresses complex problems such as customer fragmentation and information overload.

(3) A bank customer churn prediction solution based on customer segmentation is designed. A heterogeneous ensemble algorithm, G_R_L_D, is proposed and applied to a churn prediction model built on imbalanced data; its precision is 79%, which is better than most other algorithms. In addition, SMOTE-ENN sampling is used to handle the global imbalance of the bank customer churn data; it improves recall while maintaining high precision, and further improves the accuracy of G_R_L_D and its base classifiers by more than 3%. Finally, unilateral SMOTE-ENN deviation correction sampling is proposed to deal with the local imbalance of the bank customer churn data, with personalized modeling for different customer groups; precision improves by 1%, and the other performance indicators improve slightly.
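The SMOTE-ENN resampling mentioned in the abstract can be illustrated with a minimal sketch. It uses only the Python standard library and makes several assumptions that do not come from the thesis: the function names (`smote`, `edited_nearest_neighbours`, `smote_enn`) are hypothetical, the two-feature toy data is invented, the neighbourhood size is fixed at `k=3`, and the ENN step uses a simple majority vote applied to both classes. It does not reproduce the unilateral deviation correction variant, whose details the abstract does not give.

```python
import math
import random
from collections import Counter


def smote(samples, minority_label, n_new, k=3, rng=None):
    """Create n_new synthetic minority samples by interpolating between a
    minority point and one of its k nearest minority neighbours.
    Assumes the minority class has at least two points."""
    rng = rng or random.Random(0)
    minority = [x for x, y in samples if y == minority_label]
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        others = [m for m in minority if m is not base]
        neighbours = sorted(others, key=lambda m: math.dist(base, m))[:k]
        target = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> target
        point = tuple(b + gap * (t - b) for b, t in zip(base, target))
        synthetic.append((point, minority_label))
    return synthetic


def edited_nearest_neighbours(samples, k=3):
    """Keep only samples whose label matches the majority label of their
    k nearest neighbours (assumed majority-vote rule, both classes)."""
    kept = []
    for i, (x, y) in enumerate(samples):
        others = [s for j, s in enumerate(samples) if j != i]
        neighbours = sorted(others, key=lambda s: math.dist(x, s[0]))[:k]
        vote = Counter(label for _, label in neighbours).most_common(1)[0][0]
        if vote == y:
            kept.append((x, y))
    return kept


def smote_enn(samples, minority_label, k=3, seed=0):
    """Oversample the minority class up to the majority count, then clean
    the balanced set with ENN. Returns (balanced, cleaned)."""
    counts = Counter(y for _, y in samples)
    n_new = max(counts.values()) - counts[minority_label]
    rng = random.Random(seed)
    balanced = samples + smote(samples, minority_label, n_new, k, rng)
    return balanced, edited_nearest_neighbours(balanced, k)


# Invented toy data: label 0 = retained customer, label 1 = churned customer.
toy_rng = random.Random(42)
stay = [((toy_rng.gauss(0, 1), toy_rng.gauss(0, 1)), 0) for _ in range(40)]
churn = [((toy_rng.gauss(3, 1), toy_rng.gauss(3, 1)), 1) for _ in range(8)]
data = stay + churn

balanced, cleaned = smote_enn(data, minority_label=1)
print("original:", Counter(y for _, y in data))
print("after SMOTE:", Counter(y for _, y in balanced))
print("after ENN:", Counter(y for _, y in cleaned))
```

In practice a maintained implementation such as `SMOTEENN` from the imbalanced-learn package would normally be preferred over hand-written code. The sketch is meant only to make the two stages visible: oversampling balances the classes, and the cleaning step then removes samples that sit among neighbours of the other class.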
Keywords/Search Tags: data mining, churn prediction, imbalanced data, multistage clustering, unilateral deviation correction sampling