Research On Item Selection And Customer Clustering Algorithms In Direct Marketing

Posted on:2008-02-24

Degree:Master

Type:Thesis

Country:China

Candidate:X A Xu

Full Text:PDF

GTID:2178360212997458

Subject:Computer application technology

Abstract/Summary:

In recent years, direct marketing, a new marketing model that operates without a physical store, has attracted close attention. In this model, a large volume of customer feedback data helps the seller choose which products to promote and which customers to target, and direct marketing is therefore becoming a trend in more and more fields.

In direct marketing, "direct" means that an enterprise contacts customers directly through various channels. "Feedback" means that customers who receive the information respond and communicate their purchase intent to the enterprise, which stores these responses as a basis for future target-market positioning and business decisions. The two most important elements of direct marketing are thus the choice of what to offer and the positioning of the target market, i.e. item selection and customer clustering. These are exactly the areas where data mining is most effective, so studying both with data mining theory is of great significance.

In this paper, we study the two pivotal elements of direct marketing, item and customer, separately. Using data mining theory, we give a solution for each, and finally we propose a new algorithm that considers items and customers together and finds a reasonable mapping between them. The content of this paper is as follows.

(1) In item selection, we find that the profit of an item comes not only from the item itself but also from the influence of correlated items, the so-called "cross-selling" effect. An in-depth study of this influence reveals a similarity between items and web pages: in both cases a stochastic model governs the choice of the next element.
We therefore adapt the classical algorithms of web link analysis to item selection, a strategy that previous studies have shown to be feasible. Based on Google's link analysis algorithm "PageRank", this paper proposes a new algorithm called "ItemRank" that solves item selection while accounting for the cross-selling effect. The algorithm defines an individual customer purchase model based on the random surfer model of PageRank. This model characterizes a customer's behavior when purchasing a series of items and uses a weight-transfer strategy to compute each item's ItemRank weight. The algorithm also uses association rules to model the cross-selling effect between items, with the confidence of a rule standing in for link strength, and thereby models cross-selling together with item profit. ItemRank is evaluated on the IBM standard synthetic dataset; the results show that it solves item selection with cross-selling taken into account and achieves better results.

(2) In customer clustering, a review of previous work shows several difficulties specific to direct marketing. First, the data is extremely imbalanced: the proportion of responding customers is so low that many algorithms based on statistical analysis cannot work. Second, the dataset is very high-dimensional but sparse in meaningful information, which leads to the "curse of dimensionality". Third, and hardest to handle, there is often an inverse correlation between the likelihood of a purchase and the amount of the transaction: the larger the amount, the rarer the transaction, so a purely statistical method tends to rank such transactions low and ignore "big customers". To address these problems, we propose a customer clustering algorithm based on individual value, called "BIVCCluster".
This algorithm makes creative use of association rules and proposes the notion of a "focused association rule" to handle the data imbalance. It also pushes customer value into the selection of association rules and chooses "covering rules" from the selected rules to build a tree model that predicts potential customer value. Results on the KDD98 dataset show that our algorithm has a significant advantage over other algorithms.

(3) Finally, since items and customers are inseparable in direct marketing, we propose a new algorithm built on classical catalog segmentation theory, called "A Dual Segmentation Algorithm based on Cross-Selling with Customer-Oriented", or "DualSeg" for short. DualSeg finds a reasonable mapping between items and customers: it seeks k item catalogs of size r while accounting for cross-selling, derives the customer segment corresponding to each catalog, and maximizes overall profit. The algorithm first explains the meaning of "customer-oriented" and uses it to address the waste of customer resources in previous algorithms. It then gives the model design based on catalog segmentation theory and describes the concept of "k-Maximum Element Weight Cover With t". To handle cross-selling among items, it defines a cross effect factor, "csfactor", based on association rules, and uses the confidence of a special association rule, called a "loss rule", to model the profit lost when some items are left out of a catalog; from this it derives the sub-algorithm for catalog item selection. Finally, DualSeg gives a greedy algorithm for the whole problem. Experiments show that our algorithm achieves better results.
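
The ItemRank idea in part (1), a PageRank-style weight transfer in which the confidence of an association rule plays the role of link strength, can be illustrated with a minimal Python sketch. This is not the thesis's actual algorithm, whose details the abstract does not give. The function names `rule_confidence` and `item_rank`, the damping factor of 0.85, and in particular the use of item profit as the teleport distribution are all assumptions made for illustration.

```python
from itertools import permutations


def rule_confidence(transactions, a, b):
    """Confidence of the rule a -> b: share of baskets containing a that also contain b."""
    with_a = [t for t in transactions if a in t]
    if not with_a:
        return 0.0
    return sum(1 for t in with_a if b in t) / len(with_a)


def item_rank(transactions, profit, damping=0.85, iters=200, tol=1e-12):
    """PageRank-style item weights (illustrative sketch, not the thesis's method).

    Link strength a -> b is confidence(a -> b). With probability `damping`
    the customer follows a link; otherwise they jump to an item chosen in
    proportion to its profit (an assumption of this sketch).
    """
    items = sorted(profit)
    conf = {(a, b): rule_confidence(transactions, a, b)
            for a, b in permutations(items, 2)}
    # Total outgoing link strength of each item, used to normalize transfers.
    out_weight = {a: sum(conf[(a, b)] for b in items if b != a) for a in items}
    total_profit = sum(profit.values())
    teleport = {i: profit[i] / total_profit for i in items}

    rank = dict(teleport)
    for _ in range(iters):
        # Items with no outgoing links hand their weight back via the teleport vector.
        dangling = sum(rank[a] for a in items if out_weight[a] == 0)
        new_rank = {}
        for b in items:
            inflow = sum(rank[a] * conf[(a, b)] / out_weight[a]
                         for a in items if a != b and out_weight[a] > 0)
            new_rank[b] = ((1 - damping) * teleport[b]
                           + damping * (inflow + dangling * teleport[b]))
        delta = sum(abs(new_rank[i] - rank[i]) for i in items)
        rank = new_rank
        if delta < tol:
            break
    return rank


# Toy data (hypothetical): every basket with y or z also contains x.
baskets = [{"x", "y"}, {"x", "z"}, {"x"}]
unit_profit = {"x": 1.0, "y": 1.0, "z": 1.0}
ranks = item_rank(baskets, unit_profit)
for item, weight in sorted(ranks.items(), key=lambda kv: -kv[1]):
    print(item, round(weight, 4))
```

In the toy data, item x receives the largest weight even though all three items have equal profit, because buyers of y and z always buy x as well. This is the kind of cross-selling influence the abstract describes. A production version would mine rules with support and confidence thresholds rather than computing every pairwise confidence.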

Keywords/Search Tags:

Clustering

Related items

1	Research On Ensemble Clustering Algorithm Based On Bilateral Clustering
2	A Study Of Large-scale Data Clustering Based On Fuzzy Clustering And Its Application
3	The Research On The Method To Measure The Validity And To Abstract Knowledge Of Clustering
4	Research On Hybrid Algorithm Based On Subtractive Clustering
5	Research On Dynamic Clustering And Incremental In Data Mining
6	Study On New Data And Text Clustering Methods Based On Representatives
7	Research On Key Technology Of Clustering Analysis Optimization
8	Research On Hybrid Ant Colony Clustering Algorithm
9	The Study And Application Of New Clustering Algorithms In Image Processing And Text Clustering
10	The Research On Clustering Technology For Big Data