
Telco User Activity Level Prediction With Massive Telco Spatiotemporal Data

Posted on: 2019-06-04
Degree: Doctor
Type: Dissertation
Country: China
Candidate: C Luo
Full Text: PDF
GTID: 1368330548473215
Subject: Computer Science and Technology
Abstract/Summary:
Telco user activity level prediction plays an important role in customer retention systems. Knowing in advance how users' behavior patterns are changing, as reflected in their activity levels, helps operators improve user experience and profitability. In this thesis, we use the spatiotemporal data unique to telco operators to study a data-driven method for telco user activity level prediction. We focus on three challenges: modeling data variety, the efficiency of co-occurrence models, and improving the spatiotemporal granularity of the data. The contributions are three-fold:

1. Based on the varied attributes of telco spatiotemporal data, we develop a novel feature engineering method. Most existing methods use only data from the business support system (BSS) with basic feature engineering techniques. In contrast, our method considers not only BSS data but also data from the operation support system (OSS) and other related data such as POIs and the road network. We apply a range of feature engineering techniques, including statistical techniques, PageRank, Label Propagation, K-Nearest Neighbor, DBSCAN, Spectral Clustering and Factorization Machines, to extract features covering user demand, user feedback, quality of service, social relationships and user behavior. The results show that our method outperforms the state of the art. For the top 50,000 predicted users, it reaches 0.649, 0.126, 0.211 and 0.803 for precision, recall, F1-score and AUC in predicting active users, and 0.549, 0.194, 0.287 and 0.735 in predicting inactive users. Compared with the best performance of previous work, this is an improvement of 7.3, 1.5, 2.5 and 4.7 percentage points for precision, recall, F1-score and AUC in predicting active users, and of 8.4, 2.9, 4.3 and 3.8 percentage points in predicting inactive users.

2. Because co-occurrence patterns play an important role, we propose a more effective and efficient online LDA algorithm named OEM. Traditional methods scale badly when dealing with large co-occurrence matrices. Unlike state-of-the-art methods such as OGS, OVB and SCVB, the objective of OEM is to optimize the posterior of the document-topic and topic-word distributions. In addition, OEM achieves exact inference in each iteration by reaching a tight lower bound. As a result, OEM has lower perplexity and converges faster. We concatenate the topic features extracted by OEM with the features from the first contribution. Performance improves by 2.4, 0.4, 0.7 and 1.8 percentage points for precision, recall, F1-score and AUC in active user level prediction, and by 2.8, 1.0, 1.4 and 1.6 percentage points in inactive user level prediction.

3. We propose a localization method to improve the granularity of user spatiotemporal data. Business records are the main carrier of user spatiotemporal behavior, but they have several disadvantages. First, a record is added only when a customer uses a telco service, so it is difficult to analyze user spatiotemporal behavior while the service is not in use. Second, user locations are represented by a limited set of base station IDs. To address this, we propose a novel localization method for massive numbers of users that yields user spatiotemporal data with a higher, uniform sampling rate and better spatial representation ability. We first apply a novel automatic labeling method that uses OTT data and the road network, then infer user locations with a novel context-aware coarse-to-fine regression method (CCR). Experiments show that our method achieves a median error of 80 meters, outperforming state-of-the-art methods such as the range-based method (202 meters), Cell* (157 meters) and Fingerprinting (95.7 meters). Based on the user location data, we extract new features for user activity level prediction and concatenate them with the features from the first two contributions. Performance again improves, by 3.6, 0.7, 1.2 and 2.6 percentage points for precision, recall, F1-score and AUC in active user level prediction, and by 4.1, 1.5, 2.2 and 2.1 percentage points in inactive user level prediction.

The final results are 0.709, 0.137, 0.230 and 0.847 for precision, recall, F1-score and AUC in active user level prediction, and 0.618, 0.219, 0.323 and 0.772 in inactive user level prediction.
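
The reported figures can be cross-checked with simple arithmetic. The sketch below is not taken from the thesis: it assumes the standard definition of F1 as the harmonic mean of precision and recall, and assumes the stated gains are additive percentage points. The names `f1_score`, `add_points` and `reported` are illustrative only.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (standard F1 definition)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def add_points(base: float, *gains_in_points: float) -> float:
    """Add gains given in percentage points to a base score in [0, 1]."""
    return round(base + sum(gains_in_points) / 100, 3)


# (precision, recall, reported F1) as quoted in the abstract.
reported = {
    "active, contribution 1": (0.649, 0.126, 0.211),
    "inactive, contribution 1": (0.549, 0.194, 0.287),
    "active, final": (0.709, 0.137, 0.230),
    "inactive, final": (0.618, 0.219, 0.323),
}

for name, (p, r, f1) in reported.items():
    # Recompute F1 from precision and recall and compare with the quoted value.
    print(f"{name}: recomputed F1 = {f1_score(p, r):.3f}, reported = {f1:.3f}")

# Precision for active users: contribution 1, plus the gains quoted for
# contributions 2 and 3, compared with the quoted final value of 0.709.
print("active precision after all gains:", add_points(0.649, 2.4, 3.6))
```

Under these assumptions the recomputed F1 values agree with the reported ones to three decimal places, and the quoted gains sum to the final scores.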
Keywords/Search Tags: User Activity Level Prediction, Topic Modeling, Spatiotemporal, Localization