Research On Frequent Itemsets Mining Algorithm Based On Matrix

Posted on: 2008-02-28
Degree: Master
Type: Thesis
Country: China
Candidate: H B Wang
GTID: 2178360215457151
Subject: Computer software and theory
Abstract/Summary:
Association rule mining is an important branch of data mining that reveals the inner relationships within large volumes of data. Its goal is to find all strong association rules that satisfy a minimum support and a minimum confidence. Frequent itemset mining is not only the primary step of association rule mining but also an active and difficult problem in data mining, so it is a research subject of both theoretical significance and broad application prospects.

Research on frequent itemset mining algorithms. Building on a brief review of association rule mining, this thesis studies in depth typical frequent itemset mining algorithms such as Apriori and FP-Growth. It then examines several improved frequent itemset mining algorithms and introduces recent extensions of the frequent itemset mining problem.

An improved matrix-based frequent itemset mining algorithm. This thesis proposes an improved matrix-based algorithm for mining frequent itemsets. The algorithm draws on the ideas of the classical algorithms and introduces a new data structure, IMoFI. It uses an indirect-addressing index technique, similar to pointers, to internally encode a bitmap matrix populated with frequent items. Each element of the IMoFI matrix carries two meanings: whether a frequent item occurs in a given transaction, and the position of the next transaction in which that item appears. By using AV, the algorithm avoids generating the same candidate sets repeatedly and reduces the storage cost of IMoFI. With these improvements, the algorithm provides an effective way to search for frequent itemsets quickly and thereby substantially improves mining efficiency.

The algorithm is implemented in C# and runs on the .NET Framework. A comparison with classical algorithms shows that it performs better on short patterns, and the thesis summarizes the reasons for the improvement in mining performance.
Keywords: Data Mining, Association Rules, Frequent Itemsets Mining
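
The abstract describes matrix elements that record both whether a frequent item occurs in a transaction and where that item appears next. The following is a minimal Python sketch of that idea only, not the thesis's C# implementation; the names `build_linked_matrix` and `support`, the `ABSENT`/`END` sentinels, the per-item `head` array, and the 1-based row numbering are all assumptions made for illustration, and the AV mechanism is not modeled.

```python
ABSENT = 0   # assumed sentinel: item does not occur in this transaction
END = -1     # assumed sentinel: item occurs here and in no later transaction


def build_linked_matrix(transactions, min_count):
    """Build a transaction-by-item matrix whose non-zero cells point to the
    next transaction (1-based) that contains the same frequent item."""
    # First pass: count each item once per transaction.
    counts = {}
    for t in transactions:
        for item in set(t):
            counts[item] = counts.get(item, 0) + 1
    # Keep only items that meet the minimum support count.
    items = sorted(i for i, c in counts.items() if c >= min_count)
    col = {item: j for j, item in enumerate(items)}

    matrix = [[ABSENT] * len(items) for _ in transactions]
    head = [END] * len(items)  # first row (1-based) containing each item
    # Second pass, backwards, so each cell can store the next occurrence.
    for row in range(len(transactions) - 1, -1, -1):
        for item in set(transactions[row]):
            j = col.get(item)
            if j is None:
                continue  # infrequent item, not stored
            matrix[row][j] = head[j]
            head[j] = row + 1
    return items, head, matrix


def support(itemset, items, head, matrix):
    """Count transactions containing every item of a non-empty itemset by
    following the chain of the first item instead of scanning all rows."""
    cols = [items.index(i) for i in itemset]
    first, rest = cols[0], cols[1:]
    count = 0
    row = head[first]
    while row != END:
        if all(matrix[row - 1][j] != ABSENT for j in rest):
            count += 1
        row = matrix[row - 1][first]  # jump to the next occurrence
    return count


# Hypothetical example data.
transactions = [["a", "b", "c"], ["a", "c"], ["b", "d"], ["a", "b", "c"]]
items, head, matrix = build_linked_matrix(transactions, min_count=2)
print(items)                                       # ['a', 'b', 'c']
print(matrix)                                      # [[2, 3, 2], [4, 0, 4], [0, 4, 0], [-1, -1, -1]]
print(support(["a", "b"], items, head, matrix))    # 2
```

In this sketch, counting the support of an itemset visits only the transactions that contain its first item, which is one plausible reading of how next-occurrence links can speed up the search; the actual encoding used by IMoFI may differ.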