
Prediction And Analysis Of User Complaints On IPTV Dataset From Operators

Posted on: 2019-01-08
Degree: Master
Type: Thesis
Country: China
Candidate: R H Liu
GTID: 2428330566995920
Subject: Signal and Information Processing
Abstract/Summary:
For the long-term commercial development of IPTV, ensuring the quality of user experience is key to attracting users and increasing operator revenue, and it is also key to competing within the industry. Based on status data collected by IPTV set-top boxes and on user complaint records, this thesis obtains a KPI dataset after data cleaning and matching. To address the difficulties of analyzing and processing this imbalanced KPI dataset, the thesis then improves existing machine learning models and algorithms in two respects and establishes a user complaint prediction model with better performance.

First, to select the most effective feature subset, remove redundant information, and reduce model complexity, this thesis proposes a feature selection algorithm based on the PCA principal component matrix. Specifically, in addition to the contribution of each original feature to the principal components as a whole, the algorithm also considers the contribution of the corresponding principal components and the information gain of the original features, which yields a new way of calculating feature contribution. Experimental results show that the proposed feature selection algorithm further reduces the correlation between features and increases the accuracy of the subsequent prediction algorithms.

Second, to address the difficulty of building models on the imbalanced dataset, this thesis proposes an improved SMOTE algorithm to generate minority class data, and an undersampling method based on the K-means++ algorithm to remove redundant information from the majority class data. A decision tree is chosen as the base classifier for the user complaint prediction model built on the operators' IPTV dataset. Experimental results show that the improved SMOTE algorithm improves the prediction accuracy of the IPTV model compared with the traditional Borderline-SMOTE algorithm. Moreover, the proposed undersampling algorithm based on K-means++ removes redundant information more effectively, and gives better prediction performance, than traditional random undersampling.
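
The abstract does not describe the improved SMOTE or the K-means++ undersampling in detail, so the following is only a minimal sketch of the two general ideas, not the thesis's algorithms. It assumes classic SMOTE interpolation between minority neighbours, and it uses K-means++ seeding to pick spread-out majority samples. The function names `smote_like` and `kmeanspp_undersample`, their parameters, and the toy data are all hypothetical.

```python
import math
import random


def smote_like(minority, n_new, k=3, seed=0):
    """Classic SMOTE idea (assumed here, not the thesis's improved variant):
    create each synthetic point on the segment between a minority sample
    and one of its k nearest minority neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        nb = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> nb
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, nb)))
    return synthetic


def kmeanspp_undersample(majority, n_keep, seed=0):
    """Keep n_keep majority samples using K-means++ seeding: each new pick
    is drawn with probability proportional to its squared distance from
    the nearest sample already kept, so near-duplicates are rarely kept."""
    if n_keep > len(set(majority)):
        raise ValueError("n_keep exceeds the number of distinct samples")
    rng = random.Random(seed)
    kept = [rng.choice(majority)]
    while len(kept) < n_keep:
        d2 = [min(math.dist(p, c) ** 2 for c in kept) for p in majority]
        kept.append(rng.choices(majority, weights=d2, k=1)[0])
    return kept


# Toy data (hypothetical): 4 minority points, 25 majority points on a grid.
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
majority = [(float(i % 5), float(i // 5)) for i in range(25)]

new_minority = minority + smote_like(minority, n_new=6)
kept_majority = kmeanspp_undersample(majority, n_keep=10)
print(len(new_minority), len(kept_majority))
```

After resampling, the two classes in this toy example have equal size, and the balanced set could then be passed to a base classifier such as a decision tree. Already-kept samples get zero weight in the seeding step, which is why the undersampled set contains no repeats.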
Keywords/Search Tags: IPTV dataset, user complaints, feature selection, under-sampling, over-sampling