A Study On Weighted Frequent Pattern Mining Algorithms

Posted on:2019-07-22

Degree:Master

Type:Thesis

Country:China

Candidate:Q Xu

Full Text:PDF

GTID:2428330596958566

Subject:Computer Science and Technology

Abstract/Summary:

With the rapid development of information technology, frequent pattern mining, an important data mining technique, has become a focus of research. However, as application scenarios grow more complex, conventional frequent pattern mining can no longer meet practical needs, and weighted frequent pattern mining has gradually attracted attention. Weighted frequent pattern mining takes differences in importance between items into account and mines the frequent patterns that users care about most. Within this field, this thesis proposes two improved algorithms based on existing representative algorithms and evaluates each of them. The research findings are as follows:

(1) Among algorithms with preset item weights, an improved weighted frequent itemset mining algorithm, the Interval Byte Segment Subtraction algorithm (IBSS_FWI), is proposed to improve on the efficiency of the IWS algorithm on dense datasets. When mining weighted frequent itemsets on a dense dataset, the existing representative algorithms fall short: IWS has low operational efficiency and WIT-diff has high memory requirements. To address this problem, IBSS_FWI introduces an interval byte segment subtraction structure (IBSS), which gives the algorithm the advantages of both the bit vector and the diffset strategy. A fast method for computing the difference between two IBSSs is then proposed, together with a method for calculating the weighted support of itemsets from IBSSs. Finally, weighted frequent itemsets are mined from the weighted item transaction database by generating an IBSS-tree. IBSS_FWI is compared with two existing weighted frequent pattern mining algorithms, IWS and WIT-diff, on open datasets. Experimental results show that IBSS_FWI outperforms both WIT-diff and IWS in operational efficiency and memory usage on dense datasets.

(2) Among the weighted-item-number algorithms, an improved frequent page set mining algorithm that uses dwelling time to assign weights, called the filtering and testing algorithm (FTA), is proposed to improve on the efficiency of the existing TWTA algorithm. The existing TWTA and WT algorithms generate a large number of candidate sets and therefore run inefficiently. FTA is built on the basic principle of filtering out page sets that are unlikely to be frequent, and uses three steps (preprocessing, filtering, and testing) to mine weighted frequent page sets quickly. For the filtering step, two filtering schemes are proposed to improve the efficiency of the filtering process: the filtering algorithm based on advanced Apriori (FAA) and the WPS-tree based filtering algorithm (FWPS). This thesis compares the two implementations of FTA, FTA_FAA and FTA_FWPS, with the existing TWTA and WT algorithms on open datasets. Experimental results show that the operational efficiency of FTA_FAA and FTA_FWPS is significantly higher than that of TWTA and WT. Of the two, FTA_FWPS is slightly more efficient than FTA_FAA, while FTA_FAA has the advantage in memory usage.
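To make the ideas in part (1) more concrete, the following is a minimal Python sketch of weighted support computed with a bit vector and with a diffset. It is not the IBSS structure or the IBSS_FWI algorithm, whose details the abstract does not give. It assumes a definition commonly used in weighted itemset mining: a transaction's weight is the mean of its item weights, and an itemset's weighted support is the share of the total transaction weight carried by the transactions that contain it. All function names, the toy database, and the item weights are hypothetical.

```python
from typing import Dict, List, Set


def transaction_weights(db: List[Set[str]], weights: Dict[str, float]) -> List[float]:
    # Assumed definition: weight of a transaction = mean weight of its items.
    return [sum(weights[i] for i in t) / len(t) for t in db]


def tidset_bits(db: List[Set[str]], item: str) -> int:
    # Bit vector stored in a Python int: bit k is set if transaction k contains the item.
    bits = 0
    for tid, t in enumerate(db):
        if item in t:
            bits |= 1 << tid
    return bits


def weight_sum(bits: int, tw: List[float]) -> float:
    # Sum the transaction weights selected by the set bits.
    total, tid = 0.0, 0
    while bits:
        if bits & 1:
            total += tw[tid]
        bits >>= 1
        tid += 1
    return total


def weighted_support(bits: int, tw: List[float]) -> float:
    # Assumed definition: selected weight divided by the total weight of the database.
    return weight_sum(bits, tw) / sum(tw)


# Hypothetical toy data, for illustration only.
db = [{"a", "b"}, {"a", "c"}, {"a", "b", "c"}, {"b"}]
weights = {"a": 0.6, "b": 0.2, "c": 0.4}
tw = transaction_weights(db, weights)

bits_a = tidset_bits(db, "a")
bits_b = tidset_bits(db, "b")

# Bit-vector route: the tidset of {a, b} is the intersection of the two tidsets.
ws_ab_bitvector = weighted_support(bits_a & bits_b, tw)

# Diffset route: transactions containing a but not b; their weight is subtracted
# from the weighted support of {a} instead of materializing the tidset of {a, b}.
diffset_ab = bits_a & ~bits_b
ws_ab_diffset = weighted_support(bits_a, tw) - weight_sum(diffset_ab, tw) / sum(tw)

print(ws_ab_bitvector, ws_ab_diffset)
```

Both routes give the same weighted support for {a, b}. On dense data the diffset tends to be much smaller than the tidset, which is the general motivation for diffset-based methods.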

Keywords/Search Tags:

Data Mining, Frequent Pattern Mining, Weighted Frequent Pattern Mining, Web Log Mining

Related items

1	A Study On Weighted Frequent Pattern Mining Algorithms
2	The Research And Realization Of Mining Frequent Patterns On Business Data Streams
3	The Research On The Related Problems Of Association Rule Mining
4	A Study On Algorithms Of Weighted Frequent Pattern Mining
5	Research On Website Optimization Strategy Based On Frequent Pattern Mining
6	Study On Frequent Subtree Mining And Its Application In XML Mining
7	Constraint-Based Frequent Pattern Mining:Novel Applications And New Techniques
8	Research On The Mining Algorithm Based On Data Streams
9	Study On Probabilistic Frequent Pattern Mining Over Uncertain Data Stream
10	Research And Application Of Frequent Pattern Mining Algorithm Based On Tissue-like P System