Association Rules Mining Based On The Related Interest Measure

Posted on: 2014-12-27
Degree: Master
Type: Thesis
Country: China
Candidate: X X Wang
Full Text: PDF
GTID: 2268330401476396
Subject: Computational Mathematics
Abstract/Summary:
In recent years, data mining has attracted great attention in the field of information technology. Rapid progress in data collection and storage technology has allowed organizations to accumulate huge amounts of data, and there is wide demand for converting these data into useful information and knowledge. The resulting information and knowledge can be applied to market analysis, market planning, project planning, scientific research, and so on. Data mining is the automatic discovery of useful information in large data stores; it combines traditional data analysis methods with sophisticated algorithms for processing large amounts of data. As an important part of data mining, association rule techniques have flourished along with the development of data mining and continue to develop in broader and deeper directions. Association rule mining aims to discover interesting associations and relationships in large amounts of data. Its prospects are broad in both theoretical research and practical application. Applications range from the narrow case of market basket analysis to the design and optimization of websites, and extend to traffic pattern analysis and the association analysis of pharmaceutical ingredients. Its theoretical research makes many other types of data mining feasible, for example going from frequent pattern mining to closed pattern mining, subjective interestingness, and the mining of other related patterns. In-depth study of association rule techniques is therefore very necessary. Because association rules reveal relationships between data, and the rules found are simple in structure and easy to understand, association rules have in recent years become one of the hot topics in the field of data mining.
This thesis analyzes the advantages and disadvantages of association rules and proposes improvements that address their shortcomings. The main work includes the following aspects.

(1) Analysis and study of the classic Apriori algorithm and of the FP-Growth algorithm, which does not generate candidate sets. Mining frequent itemsets with Apriori is computationally expensive: it requires multiple passes over the database, which increases CPU overhead. FP-Growth performs much better than Apriori, since it needs to scan the database only twice and avoids generating a large number of candidate itemsets. The main defect of FP-Growth is its space overhead. This thesis therefore introduces the concept lattice. The extent of a concept is the set of all objects belonging to the concept, while its intent is the set of attributes shared by all of these objects. In essence, a concept lattice describes the links between objects and attributes and expresses the generalization and specialization relationships between concepts, and the Hasse diagram of the concept lattice is used to mine frequent itemsets. Compared with the Apriori algorithm, this approach is simple and intuitive, and the results are similar.

(2) Association rules under the support-confidence framework have some defects, so this thesis introduces interest measures. First, several existing interestingness measures for association rules are studied and analyzed in depth, and their shortcomings are pointed out. An improved method for measuring the interestingness of association rules is then proposed, some properties of the measure are proved, and the method is compared with the traditional methods. The improved method can also be used to judge positive and negative association rules, and it is not sensitive to "not buy" variables; these properties are new relative to the empirical methods. Compared with the original methods, the method has certain advantages.

(3) General association rules are deficient in handling the symmetry between the antecedent and the consequent of a rule, so item-item correlated association rule mining is proposed to remedy this shortcoming. A mining algorithm, ItemCoMine_AP, is also proposed, and its performance is tested in order to analyze the pruning effect and the application effect of the correlation measure. Theoretical analysis and practical tests show that the proposed association rules improve the quality of the generated rules, and that their application effect is clearly better than that of ordinary association rules.
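The support-confidence defect mentioned in item (2) can be illustrated with a minimal Python sketch. It is not the thesis's improved measure, nor the ItemCoMine_AP algorithm, neither of which is specified in this abstract. It pairs a plain level-wise (Apriori-style) search with lift, a widely used interestingness measure chosen here only as a stand-in. The function names and the toy transactions are assumptions made for illustration.

```python
from itertools import combinations


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def apriori(transactions, min_support):
    """Level-wise (Apriori-style) search; returns {itemset: support}."""
    items = sorted({item for t in transactions for item in t})
    level = [frozenset([i]) for i in items]
    frequent = {}
    k = 1
    while level:
        # Each level re-scans the data: the repeated-pass cost noted in item (1).
        counted = {c: support(c, transactions) for c in level}
        survivors = {c: s for c, s in counted.items() if s >= min_support}
        frequent.update(survivors)
        k += 1
        # Join step: unions of frequent (k-1)-itemsets that have exactly k items.
        candidates = {a | b for a in survivors for b in survivors
                      if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        level = [c for c in candidates
                 if all(frozenset(s) in survivors
                        for s in combinations(c, k - 1))]
    return frequent


def rule_measures(antecedent, consequent, transactions):
    """Support, confidence and lift of the rule antecedent -> consequent."""
    a, c = frozenset(antecedent), frozenset(consequent)
    s_ac = support(a | c, transactions)
    confidence = s_ac / support(a, transactions)
    lift = confidence / support(c, transactions)
    return s_ac, confidence, lift


# Hypothetical toy data: coffee is bought in 9 of 10 baskets, tea in 4.
transactions = (
    [frozenset({"tea", "coffee"})] * 3
    + [frozenset({"tea"})]
    + [frozenset({"coffee"})] * 6
)

frequent = apriori(transactions, min_support=0.25)
s, conf, lift = rule_measures({"tea"}, {"coffee"}, transactions)
print(f"support={s:.2f} confidence={conf:.2f} lift={lift:.2f}")
```

On this toy data the rule tea -> coffee passes a typical confidence threshold (confidence is about 0.75), yet its lift is below 1 (about 0.83), because coffee alone appears in 90% of the baskets. Buying tea is therefore negatively associated with buying coffee, which the support-confidence framework alone does not reveal.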
Keywords/Search Tags: Data Mining, Association Rules, Concept Lattice, Interest Measure, Pruning Effect