Research On Customer Segmentation Clustering And System For Big Dataset Of Retail Business

Posted on: 2017-06-23
Degree: Master
Type: Thesis
Country: China
Candidate: Y B Wang
GTID: 2349330503481940
Subject: Software engineering
Abstract/Summary:
Customer segmentation is a strategy for dividing customers into groups such that customers in the same group have similar shopping preferences, while customers in different groups behave differently. It helps enterprises develop appropriate strategies in competitive markets in order to increase profit. Previous research segments customers by general attributes such as age, income, and marital status. However, such methods often give unsatisfactory results: customers with similar general attributes may have different shopping preferences, and because general attributes are static, the resulting segments may miss important trends.

This thesis presents a new customer segmentation system in which a new customer clustering algorithm is used to cluster customers from massive transaction data. The work makes the following three contributions:

(1) A new distance is defined to measure the dissimilarity between customers. First, a p-tree is built for each customer from the transaction data; the distance between two customers is then computed from their two p-trees. We compared the p-tree distance with three other distances for transaction data, and the experimental results show that the p-tree distance is better suited to large-scale transaction data than the other distances.

(2) We used a customer clustering algorithm, PurTreeClustering, to cluster customers from massive transaction data. First, the customer transaction data is converted to a p-tree set, and a Cover Tree is built from the p-tree set. Next, densities are estimated to find the initial cluster centers, which have high density and are far away from each other. Finally, each p-tree is allocated to the nearest initial cluster center. We compared the PurTreeClustering algorithm with five other clustering algorithms, and the experimental results show that the new method outperformed the other methods on large-scale transaction data.

(3) We developed a customer analysis application system for visual analysis of customer relationships. The system helps users interactively analyse customers from massive transaction data, and it consists of four modules. First, the data preprocessing module is a browser-based client in which users choose the datasets they want to analyse and build p-tree sets from the transaction data. Second, the data modeling module builds the Cover Tree, from which the initial cluster centers are found. Third, the data analysis module clusters the customers. Finally, the data visualization module converts the analysis results into a meaningful form and shows users the customer relationships and what the customers were interested in. We used a series of transaction datasets to verify the effectiveness of the system. The analysis results revealed some interesting customer groups and indicate that the system is useful for analysing customers from massive transaction data.
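The pipeline in contributions (1) and (2) can be illustrated with a small sketch. The abstract does not give the p-tree distance formula or the exact density rule, so everything below is an assumption made for illustration: each product is written as a hypothetical category path, a p-tree is stored as one set of nodes per level, the distance is taken to be the average Jaccard distance over levels, and the initial centers are picked with a density-peaks style score (density times distance to the nearest denser point). A brute-force distance matrix stands in for the Cover Tree. The names `build_ptree`, `ptree_distance`, `pick_centers`, and `assign` are not from the thesis.

```python
def build_ptree(product_paths):
    """Store a customer's p-tree as one set of node paths per taxonomy level."""
    depth = max(len(path) for path in product_paths)
    levels = [set() for _ in range(depth)]
    for path in product_paths:
        for level in range(len(path)):
            levels[level].add(path[:level + 1])  # prefix identifies the node
    return levels


def jaccard_distance(a, b):
    union = a | b
    return 0.0 if not union else 1.0 - len(a & b) / len(union)


def ptree_distance(t1, t2):
    """Assumed distance: mean Jaccard distance over the tree levels."""
    if len(t1) != len(t2):
        raise ValueError("this sketch assumes p-trees of equal depth")
    return sum(jaccard_distance(a, b) for a, b in zip(t1, t2)) / len(t1)


def pick_centers(dist, k, radius):
    """Pick k points that are dense and far from any denser point."""
    n = len(dist)
    density = [sum(1 for j in range(n) if j != i and dist[i][j] <= radius)
               for i in range(n)]
    scores = []
    for i in range(n):
        denser = [dist[i][j] for j in range(n) if density[j] > density[i]]
        separation = min(denser) if denser else max(dist[i])
        scores.append(density[i] * separation)
    return sorted(range(n), key=lambda i: scores[i], reverse=True)[:k]


def assign(dist, centers):
    """Allocate every p-tree to its nearest initial center."""
    return [min(range(len(centers)), key=lambda c: dist[i][centers[c]])
            for i in range(len(dist))]


# Hypothetical toy data: three grocery shoppers and three electronics shoppers.
milk = ("food", "dairy", "milk")
cheese = ("food", "dairy", "cheese")
bread = ("food", "bakery", "bread")
phones = ("electronics", "audio", "headphones")
speaker = ("electronics", "audio", "speaker")
tv = ("electronics", "video", "tv")

customers = [
    [milk, bread], [milk, cheese], [milk, bread, cheese],
    [phones, tv], [phones, speaker], [phones, tv, speaker],
]
trees = [build_ptree(c) for c in customers]
dist = [[ptree_distance(a, b) for b in trees] for a in trees]
centers = pick_centers(dist, k=2, radius=0.3)
labels = assign(dist, centers)
print(centers, labels)
```

On this toy data the grocery and electronics shoppers end up in separate clusters. The brute-force matrix costs quadratic time in the number of customers, which is the cost an index such as a Cover Tree is meant to reduce on large datasets.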
Keywords/Search Tags: Customer Segmentation, Clustering Algorithm, Transaction Data, Similarity Measure