Research On IPTV User's Complaint Prediction Strategy Based On Imbalanced Data Processing

Posted on:2021-01-05

Degree:Master

Type:Thesis

Country:China

Candidate:C B Chen

GTID:2428330629487216

Subject:Electronic and communication engineering

Abstract/Summary:

With the rapid development of multimedia communication technology, Internet Protocol Television (IPTV), which uses the Internet as its carrier and the operator's dedicated broadband network as its transmission medium, has become the core of the digital home and has great market value. As the IPTV business continues to grow, operators urgently need to predict accurately which users will report faults, so that they can improve service in advance, increase user stickiness, prevent customer churn, and achieve proactive operation and maintenance. Machine learning is an important means of intelligent prediction. In practical engineering applications, however, real data are often high-dimensional and imbalanced, which seriously degrades the prediction accuracy of traditional algorithms. Processing high-dimensional imbalanced data is therefore a key problem that urgently needs to be solved in machine learning and related application fields. This thesis studies algorithms for high-dimensional imbalanced data and applies them to IPTV user complaint prediction. The research covers the following three aspects:

a) To address the problem that classification models for high-dimensional imbalanced data are overly complex and classify poorly, this thesis proposes an improved feature selection algorithm for imbalanced data (IFCBF_NMI). It builds on the ensemble framework of the Bagging algorithm and on the fast correlation-based filter (FCBF) feature selection algorithm based on normalized mutual information. The algorithm reduces the feature dimensionality effectively and reduces the bias of the selected features toward the majority class.

b) To address the poor classification performance of the Relevance Vector Machine (RVM) on imbalanced data sets, an imbalanced data classification algorithm (LFOA-HSRVM) based on parameter optimization and hybrid sampling is proposed. During classification, LFOA-HSRVM uses a hybrid sampling strategy based on the RVM's relevance vectors to adjust the data distribution, and uses a double-subgroup fruit fly optimization algorithm with Lévy flight characteristics to search for the optimal RVM kernel parameters. This overcomes the problem that the decision boundary of the traditional RVM is skewed toward the minority class on imbalanced data sets, and greatly improves classification performance.

c) To address the numerous complaint factors and relatively scarce fault samples among IPTV users, the proposed IFCBF_NMI feature selection algorithm and LFOA-HSRVM classification algorithm are applied to IPTV user complaint prediction. A prediction model for IPTV user complaints is established that integrates data preprocessing, feature extraction, and classification prediction. The prediction model has stable performance and high accuracy.
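
As a rough illustration of the normalized mutual information that the feature selection step relies on, the sketch below scores a discrete feature against a class label. The abstract does not state which normalization the thesis uses, so the form I(X;Y) / sqrt(H(X) H(Y)) is an assumption, as are the function names and the toy data; this is not the IFCBF_NMI algorithm itself.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy in bits of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def normalized_mutual_information(x, y):
    """NMI(X;Y) = I(X;Y) / sqrt(H(X) * H(Y)).

    The normalization is an assumed choice; returns 0.0 when either
    variable is constant (zero entropy).
    """
    hx, hy = entropy(x), entropy(y)
    if hx == 0.0 or hy == 0.0:
        return 0.0
    # I(X;Y) = H(X) + H(Y) - H(X,Y), with the joint entropy taken over pairs.
    mi = hx + hy - entropy(list(zip(x, y)))
    return mi / math.sqrt(hx * hy)


# Hypothetical toy data: 1 marks a complaining user, 0 a non-complaining one.
labels = [0, 0, 1, 1]
informative_feature = [0, 0, 1, 1]   # mirrors the label exactly
unrelated_feature = [0, 1, 0, 1]     # statistically independent of the label

scores = {
    "informative": normalized_mutual_information(informative_feature, labels),
    "unrelated": normalized_mutual_information(unrelated_feature, labels),
}
```

A filter-style selector would rank features by such a score against the label and discard features that are more strongly related to an already selected feature than to the label; continuous features would need discretization first.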

Keywords/Search Tags:

imbalanced data sets, feature selection, relevance vector machine, hybrid sampling, kernel parameter optimization, IPTV

Related items

1	Feature Selection And Classification For Imbalanced Medical Data
2	Research On Classification Method Of Imbalanced Data Sets
3	Hybrid Support Vector Machine Model For Intelligent Decision Of High Uncertain Data Sets
4	Research On The Classification Of Imbalanced Data Sets Based On R-SMOTE
5	Study On Imbalanced Data Sets Classification Method And Its Application In Telecommunication
6	Rule Extraction For Imbalanced Data Classification Based On SVM And Its Application In Commercial Bank Failures Prediction
7	Classification Algorithm Of Unbalanced Datasets
8	Research Of Parameter Selection For Support Vector Machine
9	Text Classification Algorithm Based On Imbalanced Data Sets
10	The Research And Application Of Diverse AdaBoost Relevance Vector Machine In Distributed Environment