
Research On Algorithms For Mining Frequent Itemsets In Uncertain Data Streams

Posted on: 2021-05-14
Degree: Master
Type: Thesis
Country: China
Candidate: M Y Xie
GTID: 2438330602498349
Subject: Software engineering
Abstract/Summary:
More and more data now exists in the form of data streams, such as financial transaction records, web data, weather monitoring data, e-commerce shopping data, and various kinds of sensor data. In these enormous data streams, a considerable part of the data is missing because of data source updates, environmental noise, data duplication or conflicts, and transmission equipment failures, which results in so-called uncertain data. Because uncertain data streams combine the characteristics of data streams and of uncertainty, existing mining algorithms for data streams or for uncertain databases cannot be applied directly. It is therefore necessary to design an efficient frequent pattern mining algorithm for uncertain data streams. Based on existing frequent pattern mining algorithms and the application environment of uncertain data streams, this thesis carries out the following work:

(1) UFS-mine: a frequent itemset mining algorithm for uncertain data streams based on a list structure. Almost all existing mainstream algorithms in this field store pattern information in a tree structure, in which only nodes with the same item and the same probability can share branch paths. This produces a large number of redundant nodes and consumes a great deal of memory. During mining, the entire tree is traversed frequently, which incurs a large time overhead. To solve these problems, this thesis proposes UFS-mine, an uncertain data stream mining algorithm that relies on the sliding window model and a list storage structure. The algorithm stores the information of every distinct item in a list, where each item is associated with its probability and the identifiers of the transactions in which it occurs, which avoids redundant nodes and saves memory. Computing the expectation of the corresponding patterns is also faster, which greatly improves the performance of the algorithm.

(2) DWUFS-mine: a weighted, damped mining algorithm for uncertain data streams. Most methods for mining uncertain frequent patterns compute the expectation simply by multiplying the probabilities of the elements contained in a pattern, without considering the weights of different elements. At the same time, the value of data decreases gradually over time: fresh data is more valuable for reference and research than old data. Therefore, building on UFS-mine, this thesis proposes the weighted, damped uncertain data stream mining algorithm DWUFS-mine. The algorithm takes both data uncertainty and weight attributes into account, and the expectation of old data is reduced according to a preset attenuation factor. Experiments show that the algorithm can be applied effectively to scenarios that focus on data weights and are more sensitive to fresh information.
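The list-based idea described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the class name `UncertainItemLists`, its methods, the brute-force enumeration, and in particular the way weights and the attenuation factor enter the formula (mean item weight times `decay` raised to the transaction's age) are all assumptions made for illustration. What the sketch does take from the abstract is that each distinct item keeps its own list of transaction identifiers and probabilities, that a sliding window bounds the data, and that the basic expectation of a pattern is the sum, over shared transactions, of the product of its items' probabilities.

```python
import math
from collections import deque
from itertools import combinations


class UncertainItemLists:
    """Hypothetical per-item lists (tid -> probability) over a sliding window."""

    def __init__(self, window_size):
        self.window_size = window_size
        self.window = deque()   # transaction ids currently inside the window
        self.lists = {}         # item -> {tid: existential probability}
        self.next_tid = 0

    def add(self, transaction):
        """Add one transaction given as {item: probability}; evict the oldest if full."""
        if len(self.window) == self.window_size:
            oldest = self.window.popleft()
            for item in list(self.lists):
                self.lists[item].pop(oldest, None)
                if not self.lists[item]:
                    del self.lists[item]
        tid = self.next_tid
        self.next_tid += 1
        self.window.append(tid)
        for item, prob in transaction.items():
            self.lists.setdefault(item, {})[tid] = prob

    def expected_support(self, itemset, weights=None, decay=1.0):
        """Sum over shared transactions of the product of item probabilities.

        Assumed extension: scale by the mean item weight and by
        decay ** age, where age counts transactions since the newest one.
        """
        entries = [self.lists.get(item, {}) for item in itemset]
        if not entries:
            return 0.0
        # Transactions that contain every item of the pattern.
        common = set(entries[0]).intersection(*entries[1:])
        if not common:
            return 0.0
        newest = self.window[-1]
        total = 0.0
        for tid in common:
            prob = math.prod(entry[tid] for entry in entries)
            total += prob * decay ** (newest - tid)
        if weights is None:
            return total
        mean_weight = sum(weights[item] for item in itemset) / len(itemset)
        return mean_weight * total

    def frequent_itemsets(self, min_esup, max_size=2):
        """Brute-force enumeration up to max_size items (illustration only)."""
        result = {}
        items = sorted(self.lists)
        for k in range(1, max_size + 1):
            for combo in combinations(items, k):
                esup = self.expected_support(combo)
                if esup >= min_esup:
                    result[combo] = esup
        return result


store = UncertainItemLists(window_size=3)
store.add({"a": 0.9, "b": 0.5})
store.add({"a": 0.8, "c": 0.4})
store.add({"a": 0.5, "b": 0.6})
print(store.frequent_itemsets(min_esup=0.7))
```

Because every pattern's expectation is computed by intersecting a few per-item dictionaries, no tree has to be traversed, and evicting the oldest transaction only removes one key from each affected list. A real miner would prune candidates rather than enumerate all combinations.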
Keywords/Search Tags:Data Mining, Uncertain Data, Frequent Pattern, Data Stream