
The Prediction Of User Outage Behavior Of Telecom Operators

Posted on: 2018-07-14
Degree: Master
Type: Thesis
Country: China
Candidate: Y Q Huang
Full Text: PDF
GTID: 2428330596989143
Subject: Computer technology
Abstract/Summary:
It has been nearly 20 years since the reform of the communications industry in 1998. Competition in the communications market is increasingly fierce and the market is approaching saturation; as a result, winning new customers and retaining regular customers is essential for operators, and user off-grid (customer defection) warning and retention has become an important task. The thesis carries out data mining in accordance with CRISP-DM, i.e. the cross-industry standard process for data mining, and predicts users' off-grid behavior through five steps: business understanding, data understanding, data pre-processing, model building, and evaluation. Taking the historical data of an operator in Shanghai as the object of study, it first extracts data from the various subsystems of the BOSS system, cleans and integrates the data into a wide table, and loads it into the OLAP platform, where the data is given a preliminary exploration. Based on this exploration, it defines and explains the relevant data content in order to find appropriate data mining models. By comparing the advantages and disadvantages of Logistic Regression, Neural Network, Bayesian Analysis, Decision Tree, and Random Forest, it concludes that the Decision Tree and the Random Forest are suitable for this prediction task. It then builds and compares Decision Tree and Random Forest models with a data mining tool to predict user off-grid. During data mining, it reduces the dimensionality of the discrete values by means of the PCA linear transformation, and it analyzes and optimizes the leaf node size of the Random Forest, the number of sub-models, the feature groups, and the score threshold, guided by numerical evaluation indices, the ROC curve, and similar measures, to further improve the accuracy and speed of prediction. Evaluated on the operator's historical operation data, the C5.0 Decision Tree reaches a recall ratio of about 62.1% and a precision ratio of about 45%. The Random Forest with tuned parameters, using data of the same month to predict user off-grid, reaches a recall ratio of 79.8% and a precision ratio of 84.2%; using the data of the current month to predict user off-grid in the next month, it reaches a recall ratio of 71.2% and a precision ratio of 70.1%. This is a clear improvement over the Decision Tree, so the expected result has been reached. The thesis applies currently popular mining models, combined with the characteristic features of the communications industry, to predict user off-grid for telecom operators, so that the relevant business units can develop retention measures more effectively and reduce retention costs.
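The recall and precision ratios quoted in the abstract, and their dependence on the score threshold, can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the function name `precision_recall`, the labels, and the scores are hypothetical toy values chosen only to show how the two ratios are computed once a model has assigned each user an off-grid score.

```python
def precision_recall(labels, scores, threshold):
    """Return (precision, recall) when users scoring >= threshold are flagged as off-grid.

    labels: 1 if the user actually went off-grid, 0 otherwise.
    scores: model scores in [0, 1]; higher means more likely to go off-grid.
    """
    tp = fp = fn = 0
    for label, score in zip(labels, scores):
        flagged = score >= threshold
        if flagged and label == 1:
            tp += 1          # correctly flagged off-grid user
        elif flagged and label == 0:
            fp += 1          # loyal user flagged by mistake
        elif not flagged and label == 1:
            fn += 1          # off-grid user the model missed
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical toy data, not taken from the thesis.
labels = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.2, 0.1, 0.6, 0.3]

for threshold in (0.3, 0.5, 0.75):
    p, r = precision_recall(labels, scores, threshold)
    print(f"threshold={threshold:.2f}  precision={p:.3f}  recall={r:.3f}")
```

On this toy data, raising the threshold trades recall for precision, which is the kind of trade-off that tuning the score threshold has to balance.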
Keywords/Search Tags: Data mining, Random Forests, Customer Defection