
Research And Application Of Association Rule Extraction Method For Incremental Data

Posted on: 2021-01-21  Degree: Master  Type: Thesis
Country: China  Candidate: W T Liu
GTID: 2428330632954258  Subject: Computer technology
Abstract/Summary:
With the continuous improvement of information technology and the steady development of various industries, the amount of data accumulated in each industry is growing rapidly, and the real-time requirements for data processing keep rising. Choosing a reliable and efficient method to store and process continuously updated data is therefore both important and urgent. Association rules are an important part of data mining, and the study of association rules for incremental data has significant theoretical value and broad application prospects.

This thesis studies the problem of extracting association rules from incremental data and analyzes the characteristics of existing association rule mining algorithms for incremental data. It proposes an incremental-data association rule extraction method based on buffer technology and ports it to the Spark environment. It also studies the definition of implication intensity and uses it as one of the criteria for deciding whether an extracted rule is an association rule. The specific research contents are as follows.

First, the thesis surveys extraction methods for incremental-data association rules, analyzing and summarizing the classic algorithms in the field and their approaches to dynamic data. Because incremental data is time-sensitive and continuously updated, its volume is often very large, so the data can be scanned only a limited number of times. Using the traditional FP-Growth algorithm for incremental data mining requires scanning the database multiple times, which greatly reduces efficiency. The improved CAN-tree algorithm avoids repeated scans of the transaction database and repeated tree construction, but when the incremental data changes, the node counts in the tree must still be updated many times, and data loss can occur. To address these problems, the thesis proposes AB-Tree, a buffer-based method for extracting association rules from incremental data. AB-Tree uses a buffer to process data in batches, which reduces frequent scans of the tree and the occurrence of data loss. The algorithm is ported to the Spark environment for parallel computation, which improves the efficiency of association rule mining.

Second, the original extraction process uses only the minimum support threshold to decide whether a rule is kept as a mining result; the thesis adds implication intensity as a further criterion. Rules extracted with a single criterion can exhibit negative correlation, that is, the occurrence of one event reduces the likelihood of another. Introducing implication intensity improves the independence of the rules, and with it the accuracy and usability of the mining results.

Third, the buffer-based incremental-data association rule extraction algorithm is applied to a real-world scenario. The data in the transaction database is first preprocessed: the items in each transaction are arranged in a specified order, and noisy and erroneous data are removed. Association rules are then extracted. Analyzing the extracted rules can help merchants build better sales models and increase sales volume. Comparative experiments show that the algorithm is correct and effective.
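The two ideas above, folding incoming transactions into the counts one buffered batch at a time and checking a rule for negative correlation, can be illustrated with a minimal Python sketch. This is not the thesis's AB-Tree: the class `BufferedCounter` and its methods are hypothetical names, flat counters stand in for the tree structure, and lift is used as a simple stand-in for the implication intensity measure, whose definition the abstract does not give.

```python
from collections import Counter
from itertools import combinations


class BufferedCounter:
    """Hypothetical helper: buffers transactions and updates counts per batch."""

    def __init__(self, buffer_size):
        self.buffer_size = buffer_size
        self.buffer = []            # transactions not yet folded into the counts
        self.counts = Counter()     # frozenset of items -> transactions containing it
        self.n_transactions = 0     # transactions already folded into the counts

    def add(self, transaction):
        """Queue one transaction; update the counts only when the buffer is full."""
        self.buffer.append(frozenset(transaction))
        if len(self.buffer) >= self.buffer_size:
            self.flush()

    def flush(self):
        """Fold every buffered transaction into the counts in a single pass."""
        for items in self.buffer:
            ordered = sorted(items)  # fixed item order, as in the preprocessing step
            for item in ordered:
                self.counts[frozenset([item])] += 1
            for pair in combinations(ordered, 2):
                self.counts[frozenset(pair)] += 1
        self.n_transactions += len(self.buffer)
        self.buffer.clear()

    def support(self, *items):
        """Fraction of counted transactions that contain all the given items."""
        if self.n_transactions == 0:
            return 0.0
        return self.counts[frozenset(items)] / self.n_transactions

    def confidence(self, antecedent, consequent):
        """Estimate of P(consequent | antecedent) for single-item rules."""
        base = self.support(antecedent)
        return self.support(antecedent, consequent) / base if base else 0.0

    def lift(self, a, b):
        """Stand-in correlation check: a value below 1 means negative correlation."""
        denom = self.support(a) * self.support(b)
        return self.support(a, b) / denom if denom else 0.0


# Illustrative data (made up): tea and coffee often co-occur, yet tea buyers
# are less likely to buy coffee than customers overall.
stream = (
    [{"tea", "coffee"}] * 3
    + [{"tea"}]
    + [{"coffee"}] * 6
)

miner = BufferedCounter(buffer_size=4)
for transaction in stream:
    miner.add(transaction)
miner.flush()  # fold in whatever is left in the buffer

print("support:", miner.support("tea", "coffee"))
print("confidence:", miner.confidence("tea", "coffee"))
print("lift:", miner.lift("tea", "coffee"))
```

On this made-up data the rule tea → coffee would pass typical support and confidence thresholds, but its lift is below 1, which is the kind of negatively correlated rule that a second criterion is meant to filter out.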
Keywords/Search Tags:association rules, incremental data, extract, CAN tree, the buffer