Research On Data Stream Clustering And Its Applications Based On Correlations

Posted on: 2008-09-24
Degree: Master
Type: Thesis
Country: China
Candidate: J C Shi
Full Text: PDF
GTID: 2178360215451586
Subject: Computer application technology
Abstract/Summary:
Since the end of the last century, data stream techniques have been developed to meet the needs of network monitoring, intrusion detection, information analysis, and business transaction management and analysis. The input to such systems is a continuous, ordered stream, and data streams are time-sensitive, real-time, massive, and transient. Representative examples include network clickstreams, monitoring data streams, stock data streams, and supermarket sales data streams. Data stream analysis includes classification, clustering, and frequent pattern mining. Because such streams are unbounded and transient, new techniques such as sliding windows and one-pass algorithms are used to analyze them. Building on an introduction to data streams and the key algorithms of data stream mining, this thesis proposes an algorithm for computing correlations between commodities, and then a correlation-based data stream clustering algorithm for clustering commodities.

The research contents of this dissertation are as follows:
(1) An introduction to data mining, data streams, and the key techniques and algorithms of data stream mining, including the classification algorithms VFDT and CVFDT, the clustering algorithms STREAM and CluStream, and frequent pattern mining algorithms such as FP-Stream.
(2) A data-stream-based correlation algorithm is proposed for measuring correlations between commodities in supermarkets. The algorithm computes these correlations quantitatively, and because it uses data stream methods it can process an unbounded stream in limited time and memory. Experiments show that it measures correlations between supermarket commodities effectively and at low cost.
(3) A correlation-based data stream clustering algorithm is proposed for clustering commodities in supermarkets. It clusters commodities using the correlations already computed. The algorithm is dynamic. Experiments show that it clusters commodities effectively and yields valuable results.
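
The abstract does not state which correlation measure or clustering procedure the thesis uses, so the following Python sketch is only an illustration of the general idea in items (2) and (3), not the author's method. It assumes a Jaccard-style co-occurrence score maintained over a sliding window of the most recent transactions, and it groups commodities by linking every pair whose score reaches a threshold. The names `SlidingWindowCorrelation`, `cluster`, `window`, and `threshold` are hypothetical.

```python
from collections import Counter, deque
from itertools import combinations


class SlidingWindowCorrelation:
    """Assumed measure: Jaccard co-occurrence over the last `window` transactions."""

    def __init__(self, window):
        self.window = window
        self.transactions = deque()   # the transactions currently in the window
        self.item_counts = Counter()  # transactions containing each item
        self.pair_counts = Counter()  # transactions containing each item pair

    def _update(self, items, delta):
        # Add (delta=+1) or retire (delta=-1) one transaction's contribution.
        for item in items:
            self.item_counts[item] += delta
        for pair in combinations(sorted(items), 2):
            self.pair_counts[pair] += delta

    def add(self, transaction):
        # One-pass update: each transaction is read once, then evicted
        # when it falls out of the window, so memory stays bounded.
        items = frozenset(transaction)
        self.transactions.append(items)
        self._update(items, +1)
        if len(self.transactions) > self.window:
            self._update(self.transactions.popleft(), -1)

    def correlation(self, a, b):
        both = self.pair_counts[tuple(sorted((a, b)))]
        either = self.item_counts[a] + self.item_counts[b] - both
        return both / either if either else 0.0


def cluster(corr, items, threshold):
    """Assumed procedure: connected components of pairs scoring >= threshold."""
    parent = {item: item for item in items}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(sorted(items), 2):
        if corr.correlation(a, b) >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for item in items:
        groups.setdefault(find(item), set()).add(item)
    return sorted(sorted(group) for group in groups.values())
```

Calling `add` once per arriving transaction keeps the counts current, and `cluster` can be rerun at any time to reflect the present window, which is one simple way to obtain a clustering that changes as the stream evolves.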
Keywords/Search Tags:data stream, data stream mining, correlation, clustering