Algorithm Research And Application Of Association Rules In Data Mining

Posted on: 2005-08-14
Degree: Master
Type: Thesis
Country: China
Candidate: Q Q Zhang
Full Text: PDF
GTID: 2168360152966516
Subject: Management Science and Engineering
Abstract/Summary:
Data mining, or knowledge discovery, emerged in the late 1980s and has become a hotspot in the fields of artificial intelligence and database technology. It has wide application prospects and is expected to continue to flourish in the new millennium. R. Agrawal et al. first raised the problem of mining association rules in 1993. It has since become a significant part of data mining and draws the attention of many researchers.

The typical association rule algorithm is Apriori, proposed by R. Agrawal. However, to calculate the support of candidate itemsets, the algorithm needs to scan the whole database on every pass. Yet as k increases, not only does the number of k-itemsets fall, but the transactions that contain these itemsets also become fewer. Because databases keep growing and are updated frequently, it is rather difficult to design efficient data mining algorithms. In addition, most algorithms must rescan the whole large database when new data are added to it. Moreover, itemsets that include new items are often regarded as infrequent even if they occur frequently in the new data set, because the support of an itemset is calculated over the whole database. As a result, the association rules derived from such frequent itemsets cannot reflect recent business activities.

Based on the research background of data mining and the problems of existing association rule algorithms, this dissertation completes the following work:

1. Analyses current data mining techniques. Starting from the basic concepts of data mining, this dissertation compares and analyses the differences between data mining and other methods such as KDD and OLAP, and classifies and summarizes in detail the objects of data mining, the discoverable patterns and the common techniques.

2. Analyses current techniques for mining association rules. Starting from the basic concepts of association rules, this dissertation comprehensively classifies and summarizes their types, and summarizes, analyses and studies the typical mining algorithms and their basic ideas in detail. The differences among these algorithms are then compared objectively. The various optimization techniques designed to improve algorithm efficiency are also studied and discussed in detail, and their merits and defects are analysed objectively.

3. To address the deficiencies of the Apriori algorithm, this dissertation proposes a highly efficient algorithm for mining association rules. The new algorithm uses the support of candidate itemsets to filter and delete records of the database. Because the database scanned to count itemset support is smaller than the original, the efficiency of the whole algorithm improves. In addition, a new method of generating the candidate set Ck is introduced that avoids many scans of the itemsets Lk-1 while generating Ck, which further improves overall efficiency.

4. To solve the problem that existing algorithms cannot find new items in a new data set, this dissertation proposes a new concept, sensitivity, which measures how much weight an algorithm gives to the new items that appear in the new data set. This dissertation then improves the original Incremental Updating Algorithm in terms of sensitivity and time efficiency, and the two algorithms are compared and analysed with an example.

5. Finally, this dissertation comprehensively analyses Customer Relationship Management (CRM) and discusses the application of data mining and association rules in CRM based on a data warehouse.
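
The idea in item 3, shrinking the database that later passes must scan, can be illustrated with a small sketch. This is not the dissertation's algorithm, whose details the abstract does not give. It is plain Apriori combined with the well-known transaction-reduction heuristic: a transaction that contains no frequent k-itemset cannot contain any frequent (k+1)-itemset, so it is dropped before the next pass. The function name `apriori_with_reduction`, the use of an absolute support count for `min_support`, and the sample transactions are all assumptions made for illustration.

```python
from itertools import combinations


def apriori_with_reduction(transactions, min_support):
    """Return {itemset: support count} for all frequent itemsets.

    min_support is an absolute transaction count (an assumption of this sketch).
    """
    db = [frozenset(t) for t in transactions]

    # Pass 1: count the support of every single item.
    counts = {}
    for t in db:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    frequent = {s: c for s, c in counts.items() if c >= min_support}
    result = dict(frequent)

    k = 2
    while frequent:
        prev = set(frequent)
        # Join step: unite pairs of frequent (k-1)-itemsets into k-itemsets.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in prev for s in combinations(c, k - 1))
        }

        # Count support over the current (possibly reduced) database.
        counts = dict.fromkeys(candidates, 0)
        matches = []
        for t in db:
            hit = [c for c in candidates if c <= t]
            for c in hit:
                counts[c] += 1
            matches.append(hit)
        frequent = {s: c for s, c in counts.items() if c >= min_support}

        # Transaction reduction: keep only transactions that contain at
        # least one frequent k-itemset; the rest cannot support any
        # frequent (k+1)-itemset and need not be scanned again.
        db = [t for t, hit in zip(db, matches)
              if any(c in frequent for c in hit)]

        result.update(frequent)
        k += 1
    return result


# Hypothetical market-basket data, for illustration only.
transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]
result = apriori_with_reduction(transactions, min_support=3)
for itemset, count in sorted(result.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), count)
```

Because support here is counted against the whole database, an item that appears only in recently added transactions would still need `min_support` occurrences overall to be reported, which is the limitation that item 4 addresses with its sensitivity measure.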
Keywords/Search Tags: Data Mining, Association Rules, Frequent Itemset, Data Warehouse, CRM