Data Stream Clustering Algorithm Based On Density And Fractal Dimension

Posted on: 2013-04-03
Degree: Master
Type: Thesis
Country: China
Candidate: J Y Jin
Full Text: PDF
GTID: 2248330377460384
Subject: Management Science and Engineering
Abstract/Summary:
In recent years, with the rapid development and wide application of information technology, various applications have come to generate large volumes of streaming data. Such data is continuous, ordered, fast-changing, and massive. Clustering is an important data mining method, but traditional clustering algorithms cannot be applied to data streams directly. Scholars have done a great deal of research on data stream clustering; however, many problems remain to be studied and resolved.

Fractal geometry has developed quickly in recent years and has been widely used in areas such as geography, transportation, and meteorology. Fractal data mining exploits the fractal characteristics of a data set, that is, the similarity in structure or features between a part and the whole. The fractal dimension is an important indicator of these characteristics and can describe a data set effectively. A change in the fractal dimension indicates that some characteristic of the data set, such as its trend or distribution, has changed.

In this thesis, classical data stream clustering algorithms and fractal theory are systematically studied and comprehensively summarized, and the deficiencies of some popular data stream clustering algorithms are considered. On the basis of previous research, a data stream clustering algorithm based on density and fractal dimension is presented. It consists of an online phase and an offline phase and combines the advantages of density clustering and fractal clustering, thereby overcoming the deficiencies of traditional clustering algorithms. The algorithm adopts a density decaying strategy to reflect the timeliness of the data stream. Experiments show that the algorithm improves the efficiency and accuracy of data stream clustering and can find clusters of arbitrary shape as well as non-neighboring clusters.
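The abstract does not give the formulas behind the density decaying strategy or the fractal dimension estimate, so the following Python sketch only illustrates the two ideas under stated assumptions. It assumes exponential decay of grid-cell density (a common choice in grid-based stream clustering) and a box-counting estimate of the fractal dimension. The names `DecayingGrid`, `box_counting_dimension`, and the `decay` parameter are hypothetical and are not taken from the thesis.

```python
import math
from collections import defaultdict


class DecayingGrid:
    """Grid cells whose density fades over time (assumed exponential decay)."""

    def __init__(self, cell_size, decay=0.998):
        self.cell_size = cell_size
        self.decay = decay                 # hypothetical decay factor in (0, 1)
        self.density = defaultdict(float)  # cell -> decayed point count
        self.last_update = {}              # cell -> time of last arrival

    def cell_of(self, point):
        # Map a point to the integer coordinates of its grid cell.
        return tuple(math.floor(x / self.cell_size) for x in point)

    def add(self, point, t):
        # Fade the stored density by the elapsed time, then count the new point.
        cell = self.cell_of(point)
        elapsed = t - self.last_update.get(cell, t)
        self.density[cell] = self.density[cell] * self.decay ** elapsed + 1.0
        self.last_update[cell] = t
        return self.density[cell]


def box_counting_dimension(points, box_sizes):
    """Least-squares slope of log N(r) against log(1/r).

    N(r) is the number of boxes of side r that contain at least one point.
    At least two distinct box sizes are required.
    """
    xs, ys = [], []
    for r in box_sizes:
        occupied = {tuple(math.floor(x / r) for x in p) for p in points}
        xs.append(math.log(1.0 / r))
        ys.append(math.log(len(occupied)))
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den
```

In a sketch like this, the online phase would call `add` for each arriving point, while the offline phase could compare the box-counting dimension of a cluster before and after a candidate merge; a large change would suggest that the merge alters the cluster's structure. How the thesis actually combines the two measures is not stated in the abstract.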
Keywords/Search Tags: data stream, cluster, fractal dimension, grid