With the acceleration of economic globalization and internationalization, the number of outbound users has grown steadily, and the outbound service market has gained good development opportunities. At the same time, market competition has become increasingly intense. In such a competitive environment, differentiation is an important way for outbound service companies to enhance their competitiveness. As the core task of the outbound service market, accurate identification of pre-outbound users plays a decisive role in the precise delivery of outbound service products and in efficient exit-entry administration. The development of big data and data science has greatly promoted industry research on the precise mining of specific user groups. Accordingly, this thesis takes users' mobile terminal information interaction data as its entry point, mines the features of pre-outbound users from the big data of telecom operators, and proposes algorithms to accurately identify pre-outbound users.

This thesis used deep packet inspection technology and crawler technology to collect the parsed information of application data packets, outbound service phone information, and the base station information of outbound service agencies. From these sources, a behavior analysis reference field database was constructed. The big data of telecom operators was then matched against the reference field database to mine potential target users and extract their spatiotemporal behavior features and attribute features. This thesis proposed a correlation metric coefficient based on the maximal information coefficient and symmetric uncertainty. On this basis, an outbound feature selection algorithm for mobile users was constructed, which combines the advantages of the Fisher criterion, the metric coefficient, and the Markov blanket judgment condition.

First, a method based on multilayer classifier integration and feature fusion was proposed to identify pre-outbound users. The method first finds the optimal model
parameters using a Bayesian optimization algorithm, then uses the second-level classifiers to further learn the output features of the first-level classifiers, and then fuses the output features of the first-level and second-level classifiers to construct interactive features. Finally, an efficient gradient boosting decision tree algorithm was used to fit the interactive features for learning and prediction.

Second, a method based on a multilayer perceptron network was proposed to identify pre-outbound users. The method first uses feature mapping functions to construct nonlinear features, then finds the weight coefficients of the multilayer perceptron network using the stochastic gradient descent algorithm. Finally, the optimal multilayer perceptron network model was used to fit the fusion features for learning and prediction.

Finally, the test sample set was used to verify the two identification algorithms proposed in the thesis. The experimental results show that the accuracy of the first identification algorithm is 97.41% and that of the second is 97.54%. Both identification algorithms perform significantly better than the single model, the voting fusion model, and the two-level stacking fusion model. In summary, the two identification algorithms proposed in the thesis learn features better and greatly improve classification performance, and the second algorithm outperforms the first. As the better of the two, the second algorithm can be used to accurately identify pre-outbound users.
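
As an illustration of one ingredient of the correlation metric mentioned above, the following is a minimal sketch of symmetric uncertainty for discrete features, using the standard definition SU(X, Y) = 2 I(X; Y) / (H(X) + H(Y)). It is not the thesis's combined coefficient: how the thesis merges symmetric uncertainty with the maximal information coefficient is not stated here, and the function names and toy data below are assumptions made for illustration only.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy, in bits, of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def symmetric_uncertainty(x, y):
    """Standard symmetric uncertainty: 2 * I(X; Y) / (H(X) + H(Y)).

    Returns a value in [0, 1]; 0 means no shared information,
    1 means each variable fully determines the other.
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    h_x, h_y = entropy(x), entropy(y)
    if h_x + h_y == 0:
        # Both variables are constant; returning 0 is a convention
        # chosen for this sketch, not something taken from the thesis.
        return 0.0
    h_xy = entropy(list(zip(x, y)))   # joint entropy H(X, Y)
    mutual_info = h_x + h_y - h_xy    # I(X; Y) = H(X) + H(Y) - H(X, Y)
    return 2.0 * mutual_info / (h_x + h_y)


# Hypothetical toy data: a binary feature and a binary "pre-outbound" label.
visited_agency = [0, 0, 1, 1]
label_same = [0, 0, 1, 1]     # label fully determined by the feature
label_unrelated = [0, 1, 0, 1]  # label independent of the feature

su_same = symmetric_uncertainty(visited_agency, label_same)
su_unrelated = symmetric_uncertainty(visited_agency, label_unrelated)
```

In a feature selection setting, a score of this kind would typically be computed between each candidate feature and the class label (relevance) and between pairs of features (redundancy); continuous features would need to be discretized first.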