
Statistical Models For Customer-base Analysis

Posted on: 2015-09-30
Degree: Doctor
Type: Dissertation
Country: China
Candidate: F Liu
Full Text: PDF
GTID: 1109330434466091
Subject: Management Science and Engineering
Abstract/Summary:
It has become crucially important for companies to fully exploit their existing customer base: predicting the probability of customer churn, forecasting future transactions at both the individual and the aggregate level, and, in turn, calculating customer lifetime value. Such analysis assists firms in making marketing decisions, which include: (1) whether new customers should be acquired; (2) whether an existing customer should be 'fired'; and (3) how to allocate marketing resources. The development of information technology has made customer data much easier to collect and process. The cost of conducting customer-base analysis is therefore decreasing, which also promotes its application.

Many statistical models have been built for this purpose, and the influential ones are reviewed in detail. In this thesis, I provide two extensions, focusing on two problems that have received little attention. First, the week effect in purchasing behavior is captured through a hierarchical model. In many industries, customers are observed to visit weekly, showing a periodicity in purchasing products or services. Through an empirical study, I show that the probability of buying is higher when the interpurchase time is close to a multiple of 7 days. Thus, instead of modelling the interpurchase time directly, I divide it into two parts, the week component and the day component. The week component is modelled by one of two methods, a gamma-mixed Poisson distribution or a beta-mixed negative binomial distribution, and the day component is modelled by logistic regression. The gamma-mixed Poisson distribution is in fact the negative binomial distribution (NBD), so I call the method with the Poisson week component the NBD-logit model and the other method the B-NBD-logit model. The defection process is the same as the one nested in the MBG/NBD model. I apply these two methods to a data set that records the sales of a new product called "Kiwibubble". Compared with the MBG/NBD model, the proposed methods, especially the B-NBD-logit model, perform better in tracking the cumulative and weekly sales in both the calibration period and the prediction period.

The second problem is that the heterogeneity assumption nested in the well-known customer-base-analysis models (e.g., the Pareto/NBD model and the BG/NBD model) may not be appropriate. The distributions usually used to quantify heterogeneity across customers are the gamma distribution, the log-normal distribution, the beta distribution and so on. Each of these distributions has only one local maximum in its domain, which means that the frequency of customers increases as their characteristics move closer to the mode and decreases as they move away from it. However, the composition of the customer base may be complex, so that a single-peak distribution may not be enough to capture the heterogeneity across the cohort. This is demonstrated by a simulation study in which the data combine two groups of customers and the purchasing process otherwise follows the assumptions of the BG/NBD model. In this thesis, I modify the BG/NBD model and assume that the purchase rate follows a mixture of gamma distributions. The new models are applied to the CDNOW data, to which the BG/NBD model was first applied. The results show that the modified model performs better while keeping the merits of the original model.
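The statement in the abstract that a gamma-mixed Poisson distribution is the negative binomial distribution can be checked numerically. The following is only a minimal sketch, not code from the thesis: the function names, the shape/rate parameterization (`r`, `alpha`), the unit observation period, and the integration settings are all assumptions made for illustration.

```python
import math

def nbd_pmf(x, r, alpha):
    """Closed-form NBD pmf, assuming a gamma(shape=r, rate=alpha) mixing
    distribution on the Poisson rate and a unit-length observation period."""
    log_p = (math.lgamma(r + x) - math.lgamma(r) - math.lgamma(x + 1)
             + r * math.log(alpha / (alpha + 1.0))
             - x * math.log(alpha + 1.0))
    return math.exp(log_p)

def gamma_pdf(lam, r, alpha):
    """Gamma density with shape r and rate alpha, evaluated at lam > 0."""
    return math.exp(r * math.log(alpha) + (r - 1.0) * math.log(lam)
                    - alpha * lam - math.lgamma(r))

def poisson_pmf(x, lam):
    """Poisson pmf with rate lam > 0."""
    return math.exp(x * math.log(lam) - lam - math.lgamma(x + 1))

def mixed_poisson_pmf(x, r, alpha, upper=40.0, steps=40_000):
    """Integrate Poisson(x | lam) * Gamma(lam | r, alpha) over lam
    with a midpoint rule on (0, upper); upper and steps are illustrative."""
    h = upper / steps
    total = 0.0
    for i in range(steps):
        lam = (i + 0.5) * h  # midpoint of the i-th subinterval
        total += poisson_pmf(x, lam) * gamma_pdf(lam, r, alpha)
    return total * h

if __name__ == "__main__":
    r, alpha = 2.0, 1.5  # hypothetical parameter values
    for x in range(6):
        print(x, mixed_poisson_pmf(x, r, alpha), nbd_pmf(x, r, alpha))
```

For each count `x`, the two printed columns should agree up to the numerical integration error, which is what justifies naming the Poisson-week variant after the NBD.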
Keywords/Search Tags: customer-base analysis, statistical modeling, customer lifetime value, Monte Carlo simulation, maximum likelihood estimation, mixture distribution