Research On The Prediction Method For Imbalance Data Set

Posted on:2016-12-17

Degree:Master

Type:Thesis

Country:China

Candidate:S B Li

Full Text:PDF

GTID:2298330467488408

Subject:Software engineering

Abstract/Summary:

Classifying imbalanced data sets is a difficult problem in data mining with high practical value. It has therefore attracted wide attention from scholars in recent years, and many research results have been published in journals and at high-level conferences. Imbalanced data is a form of real observational data found in many fields. It objectively describes situations in which only a small part of the data is of interest, but that part is often hidden by a large amount of other data, which makes classification difficult. Classification strategies carried over from traditional classification problems do not handle this problem well, and the issue has drawn great attention from experts and scholars all over the world.

Customer churn management is an important part of customer relationship management and has become indispensable in modern enterprise management. In recent years, with advances in transmission technology and Internet technology, Distributed Database Management Systems (DDBMS) have been widely used in enterprises. The ample data that enterprises hold provides the necessary conditions for customer churn forecasting based on data mining. A customer churn forecast system, as an important part of an enterprise management analysis system, builds customer churn forecasting models and identifies customers who are likely to be lost, so that measures can be taken to retain them and reduce churn. The study of customer churn forecasting has therefore become a hot research topic because of its significance for improving enterprise competitiveness.

This thesis first introduces the concept of imbalanced data sets and the progress of research on imbalanced data classification worldwide. It explains why imbalanced data classification is so difficult, the treatments commonly adopted for it, and the metrics used to evaluate classification performance. Based on the present state of customer churn research, the thesis then analyzes the application of data mining technology to customer churn, the characteristics of enterprise customer churn data, and the great influence of the networked society on modern enterprises. It also puts forward a "logistic regression prediction model based on clustering stratified sampling" and "customer churn warning methods based on the analysis of network public opinion". The latter also covers parameter estimation for logistic regression under stratified sampling and a bias compensation method for the parameter estimates.
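
The abstract does not give the details of the stratified sampling scheme or of the bias compensation method, so the following Python sketch is only an illustration of the general idea, not the thesis's own algorithm. It assumes strata defined by the class label (the thesis derives its strata from clustering) and uses the standard prior correction of a logistic regression intercept as a stand-in for bias compensation. The function names `stratified_undersample` and `prior_corrected_intercept`, and the toy population of 1000 customers with a 5% churn rate, are hypothetical.

```python
import math
import random


def stratified_undersample(rows, labels, n_per_class, seed=0):
    """Draw up to n_per_class rows from each class without replacement.

    Assumption: each class label is treated as one stratum.
    """
    rng = random.Random(seed)
    by_class = {}
    for row, y in zip(rows, labels):
        by_class.setdefault(y, []).append(row)
    sample_rows, sample_labels = [], []
    for y, members in sorted(by_class.items()):
        chosen = rng.sample(members, min(n_per_class, len(members)))
        sample_rows.extend(chosen)
        sample_labels.extend([y] * len(chosen))
    return sample_rows, sample_labels


def prior_corrected_intercept(b0, population_rate, sample_rate):
    """Shift an intercept fitted on a resampled set back to the population rate.

    Standard prior correction: subtract
    ln[((1 - tau) / tau) * (ybar / (1 - ybar))],
    where tau is the population positive rate and ybar the sample positive rate.
    """
    offset = math.log(
        ((1 - population_rate) / population_rate)
        * (sample_rate / (1 - sample_rate))
    )
    return b0 - offset


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical toy population: 1000 customer ids, 50 of them churners (label 1).
rows = list(range(1000))
labels = [1] * 50 + [0] * 950
population_rate = sum(labels) / len(labels)

# Balance the classes by sampling 50 customers from each stratum.
sample_rows, sample_labels = stratified_undersample(rows, labels, n_per_class=50)
sample_rate = sum(sample_labels) / len(sample_labels)

# An intercept-only logistic model fitted on the sample has b0 = logit(sample_rate).
b0 = math.log(sample_rate / (1 - sample_rate))
corrected = prior_corrected_intercept(b0, population_rate, sample_rate)
predicted_population_rate = sigmoid(corrected)
```

In this toy case the balanced sample on its own would suggest a churn probability of 0.5, while the corrected intercept maps back to the population rate of about 0.05. Only the intercept is adjusted here; slope coefficients are left unchanged by this kind of correction.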

Keywords/Search Tags:

imbalanced data, customer churn prediction, stratified sampling, logistic regression

Related items

1	Telecom User Churn Prediction Based On Imbalanced Dataset And Customer Segmentation
2	Imbalanced Data Mixed Sampling Algorithm And Its Application In Customer Churn Prediction
3	The Research And Application Model About Churn Prediction For Mobile Customer
4	The Study Of Customer Churn Prediction Based On Data Mining
5	Application Of Data Mining In Analysis Of Customer Churn In Telecom Industry
6	The Analysis Of High Data Traffic Telecom Customer Churn Prediction And Retaining Opportunity Assessment Based On Data Mining
7	Customer Value Analysis And Loss Warning Based On Data Mining
8	Research On Forecasting Analysis Of Shifting Customer Churn Based On Data Mining
9	E-commerce Customer Churn Prediction Based On Customer Segmentation
10	Prediction Of High Value Customer Churn In Commercial Banks Based On Machine Learning