Research On 5G Potential Customer Identification Based On Data Mining

Posted on: 2022-11-01
Degree: Master
Type: Thesis
Country: China
Candidate: B M Wu
Full Text: PDF
GTID: 2518306770478454
Subject: Information and Post Economy
Abstract/Summary:
Since the Ministry of Industry and Information Technology officially granted 5G commercial licenses to China Telecom, China Mobile, China Unicom, and China Radio and Television in 2019, China has entered the 5G era. As 5G infrastructure gradually improves, competition among the three major operators has become fierce: whoever can accurately identify potential 5G users can seize market opportunities and gain a competitive advantage. Because customer behavior is predictable in the era of big data, mining user data can improve operators' marketing efficiency. This thesis therefore applies data mining to 5G package data from Chongqing in order to build a 5G potential customer identification model, find the factors that affect the use of 5G packages, and give targeted recommendations.

The dataset contains 43 features in 8 dimensions. The dependent variable is whether a user is a 5G package user, so the task is a binary classification problem. Model building has four main stages: data preprocessing, feature engineering, model training, and evaluation. First, data preprocessing resolved the missing values and the inconsistent feature scales in the dataset, and derived features were constructed. Then a GBDT model was used to reduce the dimensionality of the dataset by screening out the 20 most important features. To help the models learn from the data, the dataset was upsampled with the SMOTE algorithm to address class imbalance, and four models were trained: decision tree, random forest, AdaBoost, and XGBoost. Accuracy, recall, F1 score, AUC, and other metrics were used to select the optimal model and obtain its feature importances. The feature importances of the optimal model were then studied further through SHAP values, and targeted suggestions were given.

The results show that the XGBoost model performs best, with an AUC of 0.80, and that it distinguishes 5G package users from non-5G package users well. Based on the optimal model, the factors that affect the use of 5G packages include whether the user is a home user, average discounted consumption over the past three months, discounted consumption in the current month, the user's main tariff package, and the user's total package value. SHAP values further explain how each feature affects the prediction. Specifically, the willingness to use 5G packages first increases and then decreases as average discounted consumption over the past three months rises, and it is strongest among users aged 20 to 25. Based on these findings, the thesis recommends pricing 5G packages reasonably, targeting key groups such as young people and group users, and encouraging operators to provide more 5G package options.
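
The abstract names SMOTE as the upsampling step but gives no implementation details. As a rough illustration of the idea only, and not the thesis's own code, the following pure-Python sketch creates synthetic minority samples by interpolating between a minority sample and one of its nearest minority neighbours. The function name `smote`, its parameters, and the toy data are assumptions made for this example.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points interpolated between minority samples.

    minority: list of equal-length numeric feature lists (minority class only).
    k: number of nearest minority neighbours considered for each sample.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to `base`.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        # Place the new point at a random position on the segment
        # from `base` to `neighbour`.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical toy data: two numeric features per user,
# where the minority class stands for 5G package users.
minority_users = [[1.0, 2.0], [1.5, 1.8], [2.0, 2.2], [1.2, 2.5]]
new_users = smote(minority_users, n_new=6, k=2)
```

Because every synthetic point lies on a segment between two real minority samples, the new points stay inside the region already occupied by that class. In practice, oversampling of this kind is normally applied to the training split only, so that evaluation metrics such as AUC are computed on unmodified data.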
Keywords/Search Tags: Data mining, 5G, Potential customer identification, XGBoost