| With the rapid development of the Internet, the way the public shops has changed greatly, and the analysis of large-scale e-commerce app data has high commercial significance and research value. For mobile operators, how to quickly and effectively identify the behavior categories of e-commerce app users in large-scale mobile data, and how to mine more valuable information from user traffic, has become an important research subject. However, the traditional knowledge-engineering approach of manually identifying and labeling user behavior data is time-consuming and laborious, and it does not scale to the growing number of apps. The core work of this paper is therefore to use mobile DPI (deep packet inspection) traffic data to automatically identify the behavior of mobile e-commerce app users and to predict their purchase behavior. Specifically, the main contents of this paper are as follows: 1. Collecting mobile DPI data and preprocessing e-commerce app user data. First, the URL information of mainstream domestic e-commerce apps is extracted from mobile Internet traffic and used to generate a regular expression file, which completes the identification of traffic rules. Second, using the MapReduce framework on the Hadoop platform, the traffic rules are matched against the original mobile DPI data to filter out the mobile e-commerce app user data set. 2. A URL-based method for automatically identifying e-commerce app user behavior. To handle the large-scale URL data in the e-commerce app user data set, a URL-based method for automatically identifying e-commerce app user behavior is proposed. The method uses six feature extraction schemes: Baseline, elimination of case differences (CaseE), URL component information (Compo), URL component length information (Length), bi- and tri-grams (BiTri), and segmentation of combined words (Seg). Five machine learning algorithms (naive Bayes, support vector machine, logistic regression, decision tree, and random forest) are used to construct multi-class classification models, and the experimental results show that the accuracy of the automatic identification method exceeds 75%. 3. A method for predicting the purchase behavior of e-commerce app users based on DPI data. Features that represent users' e-commerce app buying habits are mined from the mobile DPI data from the user's perspective. Combined with the results of the URL-based automatic identification of e-commerce app user behavior, a sliding-window-based method for predicting user purchase behavior is proposed. Experiments show that the proposed method predicts user purchase behavior well, and that mining new user behavior features significantly increases prediction accuracy. |
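
The URL feature extraction schemes named in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the function names `char_ngrams` and `url_features`, the feature string format, the example URL, and the choice of character-level (rather than word-level) n-grams over the URL path are all assumptions made for illustration. The sketch covers case elimination (CaseE), component tokens (Compo), component lengths (Length), and bi- and tri-grams (BiTri); the Seg scheme is omitted because the abstract does not describe it.

```python
import re
from urllib.parse import urlsplit


def char_ngrams(text, n):
    """Return all contiguous character n-grams of text."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def url_features(url, lowercase=True):
    """Turn one URL into a list of string features (hypothetical scheme)."""
    if lowercase:
        # CaseE: remove upper/lower case differences before tokenizing.
        url = url.lower()
    parts = urlsplit(url)
    components = {
        "host": parts.netloc,
        "path": parts.path,
        "query": parts.query,
    }
    features = []
    for name, value in components.items():
        # Compo: tokens tagged with the URL component they came from.
        for token in re.split(r"[^0-9A-Za-z]+", value):
            if token:
                features.append(f"{name}={token}")
        # Length: the character length of each component.
        features.append(f"len_{name}={len(value)}")
    # BiTri: character bi- and tri-grams (assumed here to be over the path).
    for n in (2, 3):
        features.extend(f"{n}g={gram}" for gram in char_ngrams(parts.path, n))
    return features


if __name__ == "__main__":
    # Hypothetical URL, not taken from the paper's data set.
    for feature in url_features("http://Shop.Example.com/Cart/add?itemId=42"):
        print(feature)
```

Feature lists of this kind could then be vectorized (for example as bag-of-features counts) and passed to any of the five classifiers listed in the abstract; how the paper actually encodes and combines the schemes is not stated here.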