
Research On 5G Potential Customer Mining Based On Machine Learning

Posted on: 2022-12-01
Degree: Master
Type: Thesis
Country: China
Candidate: H Zhang
GTID: 2518306749467004
Subject: Applied Statistics
Abstract/Summary:
5G plays an important role in driving a new round of digital transformation and is essential infrastructure for future economic and social development. China attaches great importance to 5G, has incorporated it into the national strategy, and has introduced a number of measures to regulate its development. As the 4G era enters the critical transition window to 5G, the three basic telecom operators, the backbone of the 5G era, all hope to grow their 5G user base quickly and with high quality. However, some data show that China's mobile phone user base has already reached its ceiling: unless the general environment changes substantially, it will be difficult to break through the high point of 1.6 billion users. Each operator can therefore only improve the conversion rate of its existing users. China Telecom has invested heavily in promoting 5G user growth, and accurately identifying potential 5G customers has become an urgent problem for it.

This paper uses mobile user behavior data from a certain area to find, among existing 4G customers, those who are likely to switch to a 5G package in the future. The task can therefore be treated as a binary classification problem. In the training phase, three months of user behavior data are used to construct features, and whether a user switches to a 5G package in the following two months serves as the prediction window for the user's willingness to switch. Because 5G is still in its infancy, 5G users are few relative to existing 4G users, so the data are imbalanced, with an imbalance ratio of 24:1. The main goal of this paper is therefore to build a model that handles data imbalance, improves the recognition rate of potential 5G customers, and gives companies a reference for precision marketing.

To address data imbalance, we study in detail two levels of methods: the data level and the algorithm level. On this basis, we propose TABoost, a teacher-aided boosting algorithm for imbalanced data. First, before each iteration of the boosting algorithm, TABoost undersamples the majority class according to the classification hardness distribution, and the resulting balanced subset is used to train the base classifier. During iterative learning, conditional feature perturbation is added to select features, which addresses the low computational efficiency caused by high data dimensionality and alleviates overfitting to some extent. Second, because the first undersampling step is simple random sampling and the base classifiers learn weakly in the early iterations, we train a batch of heterogeneous classifiers in advance as a teacher model. The teacher model assists in judging sample classification difficulty during the first sampling step and the iterations of the student model, and it dominates that judgment in the student model's early iterations. Finally, we compare TABoost in detail with other methods for data imbalance on five public datasets, where TABoost performs best, and ablation experiments verify the effectiveness of each TABoost module.

On the 5G potential customer mining dataset, guided by our understanding of telecommunication services, we handle missing values and outliers and construct derived features, which helps improve the subsequent classification. In the model-building stage, we run detailed experiments on this dataset with data-level and algorithm-level methods for data imbalance, and compare TABoost with representative methods from the two levels. The results show that TABoost performs best on the four evaluation metrics AUCPRC, F1-score, G-mean, and MCC: AUCPRC is 0.512, F1-score is 0.694, G-mean is 0.711, and MCC is 0.512 (the source follows this value with a stray "is 0.703"). Compared with SMOTEBagging, the best of the other models, TABoost improves the four metrics by 2.6%, 2.3%, 2.7%, and 2.6%, respectively. Experiments show that the model constructed in this paper is effective and practical for accurately identifying potential 5G customers.
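To illustrate why imbalance-aware metrics such as F1-score, G-mean, and MCC are reported instead of plain accuracy, the following minimal Python sketch computes them from confusion-matrix counts. It is not the thesis's evaluation code. The function names, the toy labels, and both sets of predictions are assumptions made up for illustration; only the 24:1 class ratio comes from the abstract. AUCPRC is omitted because it needs predicted scores, not hard labels.

```python
import math


def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn), treating label 1 as the positive (minority) class."""
    tp = fp = fn = tn = 0
    for t, p in zip(y_true, y_pred):
        if t == 1 and p == 1:
            tp += 1
        elif t == 0 and p == 1:
            fp += 1
        elif t == 1 and p == 0:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def accuracy(tp, fp, fn, tn):
    total = tp + fp + fn + tn
    return (tp + tn) / total if total else 0.0


def f1_score(tp, fp, fn, tn):
    # Harmonic mean of precision and recall, written in terms of counts.
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def g_mean(tp, fp, fn, tn):
    # Geometric mean of recall (minority class) and specificity (majority class).
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(recall * specificity)


def mcc(tp, fp, fn, tn):
    # Matthews correlation coefficient; defined as 0 when any marginal is empty.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical toy data with the 24:1 imbalance ratio mentioned in the abstract:
# 10 positives (future 5G switchers) followed by 240 negatives.
y_true = [1] * 10 + [0] * 240

# Baseline that predicts "no switch" for everyone.
y_all_negative = [0] * 250

# Made-up classifier output: 6 of 10 positives found, 12 false alarms.
y_model = [1] * 6 + [0] * 4 + [1] * 12 + [0] * 228

for name, y_pred in [("all-negative", y_all_negative), ("toy model", y_model)]:
    counts = confusion_counts(y_true, y_pred)
    print(
        f"{name}: acc={accuracy(*counts):.3f} f1={f1_score(*counts):.3f} "
        f"g_mean={g_mean(*counts):.3f} mcc={mcc(*counts):.3f}"
    )
```

On this toy data the all-negative baseline reaches high accuracy while its F1-score, G-mean, and MCC are all zero, which is the failure mode these metrics are meant to expose on imbalanced data.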
Keywords/Search Tags: 5G potential customer mining, data imbalance, teacher idea, ensemble idea