Research On Association Rule Mining Algorithm Based On Time-stamp And Vertical Format

Posted on:2020-10-20

Degree:Master

Type:Thesis

Country:China

Candidate:J J Ma

GTID:2428330602986952

Subject:Computer Science and Technology

Abstract/Summary:

With the development of computer technology and the spread of the Internet, data plays an increasingly important role in daily life, social production, and scientific inquiry. Extracting useful information from massive data helps people make sound decisions, and that extraction is the task of data mining. This thesis studies association rule mining, a widely used data mining technique that aims to uncover hidden associations among data items. The proposed algorithms improve on the specific later-marketed consequent mining algorithm (SLMCM). The notion of later-marketed commodities can be extended to any new items added to the database, so the algorithm accommodates data updates and suits practical applications. The key step of SLMCM is to attach a time-stamp to each item, so it is also called the time-stamp based association rule algorithm. However, SLMCM executes very inefficiently, which makes it unsuitable for current big data settings. To address this problem, this thesis proposes the following improvements:

(1) An improved algorithm, E-SLMCM, adopts a vertical structure and needs only one traversal of the database to convert it into vertical format. During the conversion, the time-stamp of each item is recorded directly as the time at which the item first appears, so the items of each transaction no longer have to be sorted by time-stamp as in the original algorithm, which saves time. In addition, the algorithm processes the set enumeration tree in ascending order, and the efficiency is doubled.

(2) To improve efficiency on dense databases, the DE-SLMCM algorithm is proposed, which extends E-SLMCM with the Diffset structure. Here the set enumeration tree is processed in descending order.

(3) To adapt to big data mining, E-SLMCM and DE-SLMCM are parallelized on the popular Spark distributed framework, yielding the SPE-SLMCM and SPDE-SLMCM algorithms. Because the algorithms adopt a vertical structure, candidate item-sets with different prefixes are combined separately and do not affect one another. Multithreading can therefore be applied conveniently to distribute the computation, which makes the algorithms better suited to big data settings.
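The vertical-format conversion and the Diffset idea mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's E-SLMCM or DE-SLMCM implementation. It assumes that transactions arrive in time order, so the transaction index can stand in for the time-stamp. The names `to_vertical`, `diffset`, and `support_from_diffset` are hypothetical, and the Diffset relation used is the standard one: d(XY) = t(X) - t(Y), with support(XY) = support(X) - |d(XY)|.

```python
def to_vertical(transactions):
    """Convert a horizontal database to vertical format in one pass.

    Assumption: transactions are listed in time order, so the index of the
    transaction in which an item first appears serves as its time-stamp.
    Returns (tidsets, first_seen).
    """
    tidsets = {}     # item -> set of transaction ids containing the item
    first_seen = {}  # item -> id of the first transaction containing it
    for tid, items in enumerate(transactions):
        for item in items:
            if item not in tidsets:
                tidsets[item] = set()
                first_seen[item] = tid  # time-stamp recorded during the same pass
            tidsets[item].add(tid)
    return tidsets, first_seen


def diffset(tidset_x, tidset_y):
    """Diffset of XY relative to X: transactions containing X but not Y."""
    return tidset_x - tidset_y


def support_from_diffset(support_x, diff_xy):
    """support(XY) = support(X) - |d(XY)|."""
    return support_x - len(diff_xy)


# Toy database (illustrative data only, not taken from the thesis).
transactions = [
    ["a", "b"],
    ["a", "c"],
    ["a", "b", "c"],
    ["b", "d"],
]

tidsets, first_seen = to_vertical(transactions)

# Support of {a, b} via tidset intersection.
support_ab = len(tidsets["a"] & tidsets["b"])

# The same support via the diffset, which stays small on dense data.
d_ab = diffset(tidsets["a"], tidsets["b"])
support_ab_diff = support_from_diffset(len(tidsets["a"]), d_ab)

print(first_seen)                   # {'a': 0, 'b': 0, 'c': 1, 'd': 3}
print(support_ab, support_ab_diff)  # 2 2
```

In this sketch the time-stamps fall out of the same pass that builds the tidsets, so no per-transaction sorting is needed. On dense data most transactions contain most items, so a diffset is typically much smaller than the corresponding tidset.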

Keywords/Search Tags:

Data mining, Association rules, Time-stamp, Vertical structure, Parallel mining

Related items

1	Study On Association Rules Mining Based On Time Stamp
2	Research Of Association Rules Mining Based On Vertical Data
3	Research Of Paralleled Frequent Subgragh Mining Algorithm PG-Miner Based On Claster Environment
4	A Study On Association Rules Mining Algorithm And Its Application On Web Mining
5	Design Of Frequent Pattern Mining Algorithm LPS-Miner And Research On Parallel Formulations
6	The Research On Algorithm For Association Rules Mining Based On Vertical Data Presentation
7	Research On Association Rules Mining In Data Streams And Its Application
8	Research On Theory And Algorithms For Mining Association Rules
9	The Research Of Parallel Association Rules Mining Algorithms Based On Cloud Platform
10	Study On Parallel For Association Rules Mining