Research On Sketch-Based Data Streams Mining Of Frequent Itemsets

Posted on: 2013-07-26  Degree: Master  Type: Thesis
Country: China  Candidate: F F Dou  Full Text: PDF
GTID: 2248330395955726  Subject: Computer applications
Abstract/Summary:
With the development of science and technology, a growing number of applications produce data in the form of “continuous data streams”: data that arrives continuously, in order, and in real time. Mining frequent items (itemsets) plays a key role in data stream mining, but traditional frequent-item mining algorithms designed for static data do not apply to data streams. Using hashing, a Sketch can process data at high speed, and it stores approximate summaries of a large amount of data in limited memory, which suits the characteristics of data streams. Sketches are therefore widely used for mining frequent items in data streams. However, the traditional Sketch does not apply to multi-item itemsets, and its performance in mining frequent items is also unsatisfactory. The research in this thesis can be summarized as follows:

Because previous Sketches suffer from high memory consumption, saturation as the volume of data grows, low query accuracy, and other issues, this thesis presents several optimization techniques for Sketch and proposes a new summary data structure, ECM (extensible Count-min), which addresses these problems effectively.

Traditional frequent-item mining algorithms do not apply to uncertain data streams, and Sketch can only be applied to frequent 1-items. Combining the characteristics of Sketch and of uncertain data streams, a new algorithm for mining frequent items in uncertain data, UF-ECM, is proposed. UF-ECM not only overcomes the limitation that Sketch can only mine 1-items, but also performs well on frequent multi-item itemsets in uncertain data streams, as the experiments confirm.
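For readers unfamiliar with the hash-based Sketch the abstract refers to, the following is a minimal sketch of a standard Count-Min structure, which is the baseline that ECM is said to extend. It is not the thesis's ECM or UF-ECM, whose details the abstract does not give. The class name, the `width` and `depth` parameters, and the use of salted BLAKE2b as the hash family are all assumptions made for illustration.

```python
import hashlib


class CountMinSketch:
    """Standard Count-Min sketch (illustrative; not the thesis's ECM)."""

    def __init__(self, width=64, depth=4):
        # width: counters per row; depth: number of hash rows.
        # Both values are hypothetical defaults chosen for the example.
        self.width = width
        self.depth = depth
        self.table = [[0] * width for _ in range(depth)]

    def _index(self, item, row):
        # One hash function per row, derived by salting BLAKE2b with the row id.
        digest = hashlib.blake2b(
            str(item).encode("utf-8"),
            digest_size=8,
            salt=row.to_bytes(8, "little"),
        ).digest()
        return int.from_bytes(digest, "little") % self.width

    def add(self, item, count=1):
        # Increment one counter in every row.
        for row in range(self.depth):
            self.table[row][self._index(item, row)] += count

    def estimate(self, item):
        # Collisions can only inflate counters, so the minimum over rows
        # is an upper-biased estimate that never falls below the true count.
        return min(
            self.table[row][self._index(item, row)]
            for row in range(self.depth)
        )


def frequent_items(sketch, candidates, total, min_support):
    # Report candidates whose estimated frequency reaches min_support * total.
    # The candidate set must be supplied, since the sketch stores no item keys.
    threshold = min_support * total
    return [c for c in candidates if sketch.estimate(c) >= threshold]


stream = ["a"] * 50 + ["b"] * 30 + ["c"] * 5
sketch = CountMinSketch()
for x in stream:
    sketch.add(x)
print(frequent_items(sketch, ["a", "b", "c"], len(stream), 0.3))
```

Memory use is fixed at `width * depth` counters regardless of stream length, which is the property that makes this kind of summary attractive for streams. The structure counts single items only, which illustrates the 1-item limitation the abstract mentions.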
Keywords/Search Tags:Sketch, Frequent Items, Data Streams, Uncertain Data Stream