Research Of Association Analysis Algorithm Based On Weka

Posted on: 2016-05-17
Degree: Master
Type: Thesis
Country: China
Candidate: X Y Guo
Full Text: PDF
GTID: 2308330482468043
Subject: Electronics and Communications Engineering
Abstract/Summary:
With the rapid development of data collection and storage technology, we have entered the era of big data. Association analysis, one of the most active research directions in data mining, is used to find meaningful connections hidden in large data sets. It has been widely applied in fields such as Web mining, document analysis, communication warning analysis, network intrusion detection and bioinformatics. Association analysis therefore has both theoretical significance and practical value.

The Apriori algorithm is the basic and most classic method for mining association rules. However, it produces a large number of candidate itemsets and requires multiple scans of the database. To address these two problems, this thesis proposes a method that encodes the items and combines the $F_{k-1} \times F_{k-1}$ candidate generation approach with an "and" operation between the codes of different items. On this basis, an improved algorithm is given that speeds up mining and reduces the time spent on database operations. Finally, a simulation is given, and the results show that the proposed association analysis is effective for business decision-making.

The main work is as follows:

1. First, the theory of association analysis is discussed. Then, several topics are analyzed: frequent itemset generation, pruning based on the Apriori principle, rule generation, and objective measures of interestingness. Common association analysis algorithms are described, and their advantages and disadvantages are analyzed and compared.

2. This thesis proposes an optimized algorithm that combines the $F_{k-1} \times F_{k-1}$ method with item coding and an "and" computation between different encoded items. As a precondition, every item in the data set is first given a binary code according to the positions at which it appears in the data set. During the computation, the codes of items are combined with "and" to produce higher-order frequent itemsets; the $F_{k-1} \times F_{k-1}$ method merges two frequent $(k-1)$-itemsets if their first $k-2$ items are the same. Finally, a comparison with the original algorithm shows that the proposed algorithm is superior.

3. Finally, real supermarket shopping data are mined on the Weka platform. The patterns mined under different parameter combinations are analyzed in order to tune the parameters. Association rules are extracted from the data, and useful information is obtained from these rules. The simulation results show that this information provides important guidance for business decisions.
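
The item coding and merge step described in item 2 can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the abstract does not give the exact encoding, so the sketch assumes each item is coded as an integer whose bit t is set when the item occurs in transaction t, and the function names `encode_items` and `frequent_itemsets` are hypothetical. Under that assumption, the support of a merged itemset is the number of set bits in the "and" of the two codes, so the database is read only once.

```python
from itertools import combinations


def encode_items(transactions):
    """Map each item to an int whose bit t is 1 if the item occurs in transaction t.

    Assumed encoding for illustration; the abstract does not specify the exact scheme.
    """
    codes = {}
    for t, transaction in enumerate(transactions):
        for item in transaction:
            codes[item] = codes.get(item, 0) | (1 << t)
    return codes


def support(code):
    """Support count = number of transactions (set bits) in the code."""
    return bin(code).count("1")


def frequent_itemsets(transactions, min_count):
    """Return {sorted item tuple: support count} for all frequent itemsets."""
    codes = encode_items(transactions)  # the only pass over the data
    # Frequent 1-itemsets, keyed by a one-element tuple.
    level = {(item,): code for item, code in codes.items()
             if support(code) >= min_count}
    result = {}
    while level:
        result.update({itemset: support(code) for itemset, code in level.items()})
        next_level = {}
        # F_{k-1} x F_{k-1}: merge two (k-1)-itemsets sharing their first k-2 items.
        for (a, code_a), (b, code_b) in combinations(sorted(level.items()), 2):
            if a[:-1] == b[:-1]:
                merged = code_a & code_b  # transactions containing all items of a and b
                if support(merged) >= min_count:
                    next_level[a + (b[-1],)] = merged
        level = next_level
    return result
```

Calling `frequent_itemsets(transactions, min_count)` with a list of item sets and an absolute support threshold returns every frequent itemset with its support count. Items must be mutually comparable (for example, all strings) so that itemsets can be kept in sorted order.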
Keywords/Search Tags: Data Mining, Association Analysis, Apriori algorithm, Item coding, Weka