Study On Mining And Compressing Basket For Transaction Data

Posted on: 2017-04-02
Degree: Master
Type: Thesis
Country: China
Candidate: W W Chu
GTID: 2349330503981872
Subject: Software engineering
Abstract/Summary:
In the era of big data, we constantly produce all kinds of data and enjoy the services and convenience that data brings. Data mining is the process of extracting potentially useful, previously unknown information and knowledge from large volumes of incomplete, noisy, fuzzy, and random data. After decades of development, the underlying computer, statistical, and mathematical techniques have gradually matured, and high-performance relational database engines and wide-ranging data integration have been introduced. Data mining technology has now entered the practical stage in the data warehouse environment.
Shopping basket analysis is a typical application of data mining in the retail industry. It analyzes the combinations of goods that customers buy together, based on retail records, and extracts the valuable information in those baskets. It is now widely used in retail, for example in product promotion, shelf arrangement, and logistics. However, research and communication with retail customers show that traditional shopping basket analysis has several defects that limit its practical value.
Firstly, traditional shopping basket analysis works at a single level of the product concept hierarchy and does not consider the relationships among goods at the same level. In real life, a customer is much more likely to buy goods of the same kind than goods of different kinds. For example, customers who buy cabbage are more likely to also buy carrots and other vegetables. As a result, the baskets found by the traditional method are usually collections of goods from the same category.
Secondly, traditional market basket analysis generally evaluates baskets by support alone, so the baskets it finds are combinations of a few conventional commodities, which have little value to the enterprise. Finally, traditional shopping basket analysis faces the problem of setting the support threshold. Setting the threshold too low generates a huge number of baskets, which bury the truly significant ones; setting it too high yields too few baskets, and those that remain are conventional.
To address these problems in traditional shopping basket analysis, this thesis proposes a new method of shopping basket analysis and makes the following three contributions:
1. A product hierarchy tree is generated from the transaction data, which captures the relationships among commodities. During basket generation, the product hierarchy tree is added as a constraint so that a basket does not contain products belonging to the same parent class. This increases the diversity of goods in each basket and enhances the practical value of the baskets.
2. Sales is added to the traditional analysis as an additional dimension for assessing baskets, so that the baskets found are both common and high in value. This lets the enterprise focus on high-profit goods and increase its profits.
3. To address the huge number of baskets produced when the support threshold is set too low, this thesis presents a shopping basket compression method. Unlike previous basket compression methods, it constructs baskets with time-series characteristics and, based on the characteristics of the transaction data and of real-life shopping, uses clustering to find a set of representative baskets, thereby compressing the basket collection.
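The abstract does not give the thesis's actual algorithms, so the following is only a minimal Python sketch of the ideas in contributions 1 and 2, under stated assumptions. The product-to-parent mapping `PARENT`, the sample `TRANSACTIONS`, and the helper names `has_distinct_parents`, `mine_baskets`, and `rank_by_sales` are all hypothetical. Candidate baskets are enumerated by brute force rather than with Apriori or FP-growth, and "sales" is assumed to mean the summed sales amount of the basket's items.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical product hierarchy: item -> parent class (illustrative data only).
PARENT = {
    "cabbage": "vegetables",
    "carrot": "vegetables",
    "beer": "beverages",
    "cola": "beverages",
    "chips": "snacks",
}

# Hypothetical transactions: each maps item -> sales amount in that transaction.
TRANSACTIONS = [
    {"cabbage": 2.0, "carrot": 1.5},
    {"cabbage": 2.0, "carrot": 1.5, "beer": 6.0},
    {"beer": 6.0, "chips": 3.0},
    {"beer": 6.0, "chips": 3.0, "cabbage": 2.0},
    {"cola": 2.5, "chips": 3.0},
]


def has_distinct_parents(basket, parent):
    """Return True when no two items in the basket share a parent class."""
    parents = [parent[item] for item in basket]
    return len(parents) == len(set(parents))


def mine_baskets(transactions, parent, min_support, size=2):
    """Find baskets of a fixed size that satisfy the hierarchy constraint.

    Returns {basket: (support, total_sales)}. Support is the fraction of
    transactions containing the basket; total_sales sums the sales of the
    basket's items over those transactions.
    """
    counts = defaultdict(int)
    sales = defaultdict(float)
    for tx in transactions:
        for basket in combinations(sorted(tx), size):
            # Assumed form of the constraint: skip same-parent combinations.
            if not has_distinct_parents(basket, parent):
                continue
            counts[basket] += 1
            sales[basket] += sum(tx[item] for item in basket)
    n = len(transactions)
    return {b: (c / n, sales[b]) for b, c in counts.items() if c / n >= min_support}


def rank_by_sales(baskets):
    """Order baskets by total sales, highest first (ties broken by name)."""
    return sorted(baskets, key=lambda b: (-baskets[b][1], b))


if __name__ == "__main__":
    found = mine_baskets(TRANSACTIONS, PARENT, min_support=0.4)
    for basket in rank_by_sales(found):
        support, total = found[basket]
        print(basket, f"support={support:.2f}", f"sales={total:.1f}")
```

In this toy data, cabbage and carrot appear together in two of the five transactions and would pass the support threshold on their own, but the pair is excluded because both items share the parent class `vegetables`. The remaining baskets are then ordered by sales rather than by support alone.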
Keywords/Search Tags:Data Mining, Association Rules, Market Basket Analysis, Constraints, Market Basket Compressing