Font Size: a A A

Research On Key Techniques Of Negative Frequent Patterns Mining Based On Multiple Minimum Supports

Posted on:2016-03-25Degree:MasterType:Thesis
Country:ChinaCandidate:T T XuFull Text:PDF
GTID:2308330473966207Subject:Computer application technology
Abstract/Summary:PDF Full Text Request
Frequent patterns mining is one of the important research contents in data mining and knowledge discovery, it aims to mine frequent patterns from the database, including frequent itemsets, frequent subsequences (also called sequential patterns) and frequent substructures. As an important supplement to frequent patterns mining, negative frequent patterns mining not only takes occurring events into account, but also considers non-occurring events, it can provide a new perspective for data analysis, as well as in-depth analysis and understanding the potential implications of the data, it plays an irreplaceable role in many applications. In recent years, negative frequent patterns mining has been applied to many fields. But in practical applications, the occurrence frequencies of all items are not the same, so mining negative frequent patterns only based on the single minimum support may not correctly reflect the characteristics of the object. In view of this, each item in the database can have a minimum item support (MIS) which specified by the user, namely multiple minimum supports. This paper uses negative frequent patterns mining algorithms and frequent patterns mining algorithms with multiple minimum supports (MMS) to mine negative frequent patterns with MMS, specific contents are as follows:1. Mining frequent itemsets with MMSIn this paper, we propose an algorithm of mining frequent itemsets (FIS) with MMS, named MSB_apriori, which uses the classic Apriori algorithm and the minimum MIS value of all items as support threshold to obtain itemsets first, and then it chooses all FIS that satisfy itself MIS value. Because this algorithm is time-consuming, we also propose an optimized approach, named MSB_apriori+. The main difference is the sorting of the items in ascending order of their MIS values first. The items in each itemset also follow this order in all subsequent operations of the algorithm. So it can greatly reduce the number of candidate itemsets. Compared with MSapriori algorithm, these two algorithms are easier to understand and are suitable for changing the threshold suddenly.2. Mining negative frequent itemsets with MMSIn view of the existing algorithms of mining negative frequent itemsets (NFIS) mostly used single minimum support, in this paper, we propose an efficient algorithm of mining NFIS with MMS, named E-msNFIS. This method uses MSapriori method to mine FIS with MMS, then it generates negative candidate itemsets (NCIS) based on these FIS, and then gets all NFIS that satisfy itself MIS value. We also propose the method of setting negative itemsets’ MIS. Compared with E-NFIS algorithm which uses single minimum support, this algorithm can mine more valuable negative frequent itemsets and provide more information for the users making decision.3. Mining negative sequential patterns with MMSIn view of the existing algorithms of mining negative sequential patterns (NSP) mostly use single minimum support, in this paper, we propose an efficient algorithm of mining NSP with MMS, named E-msNSP. This method uses MS-GSP method to mine all positive sequential patterns (PSPs) with MMS, then it generates negative sequential candidates (NSCs) based on these PSPs, and then gets all NSPs that satisfy itself MIS value. We also propose the method of setting negative element and negative sequence’s MIS. Through the experiment, we compare PSP and NSP which are based on PSP in running time and the number of NSP, the results show that this method can mine lots of NSPs with less time, which proved its efficiency.4. Mining high utility sequential patterns with negative item profitsCompared with support, Utility (profit) can reflect more commercial value in many practical applications. The traditional high utility sequential patterns (HUSP) mining algorithms only consider the items’positive profits, not to mention the negative profits, while negative profits are also important in many practical applications. So in this paper, we add negative item profits to the sequential patterns mining and proposed a novel method, named HUSPNIV, which can mine high utility sequential patterns with negative item profits. Each item is associated with a quality (such as unit profit) and quantity (such as purchased number). This method adapts the Lexicographic Q-sequence Tree (LQS-Tree) to construct and organizes utility-based q-sequences, and uses I-Concatenation and S-Concatenation to generate the children’s utility based on the utility of its parent, and proposes three pruning approaches, and then obtains all HUSPs based on minimum utility threshold. It can provide more information for the users making decision.
Keywords/Search Tags:multiple minimum supports, frequent itemsets, negative frequent itemsets, negative sequential patterns, high utility, sequential patterns
PDF Full Text Request
Related items