Since the late 20th century, data acquisition technology has developed rapidly. Traditional methods of data extraction have changed with innovations in database technology and the expansion of information, and data streams have become a mainstream form of data. How to extract valuable information quickly and efficiently has therefore gradually become a hot topic in the field of data mining. Because data streams are dynamic, data stream clustering must first of all execute dynamically and process the data without interruption. Secondly, the presentation of the mining results should be intuitive and simple. In addition, a stream clustering algorithm should show the dynamic evolution of the data stream and dynamically maintain the clustering results, reflecting the timeliness of the data stream.

Traditional grid-based data stream clustering algorithms cluster on a grid of a single granularity, which improves processing speed but lowers clustering accuracy. To address this, a new data stream clustering algorithm, DBG-Stream, based on a double-layer grid and density, is put forward. The algorithm uses grids of two different granularities to cluster the data stream and, borrowing the idea of the CluStream algorithm, divides the clustering process into two stages. In the first stage, the online process applies coarse-grained grid cells to form the initial clusters. In the second stage, the offline process performs secondary clustering on fine-grained grid cells for the grid cells located on cluster boundaries, so as to improve clustering accuracy. At the same time, the algorithm sets its key parameters automatically, and it improves efficiency through a grid-deletion strategy. Experimental results show that DBG-Stream achieves considerably higher clustering accuracy than the D-Stream algorithm and effectively solves the problems of traditional grid-based clustering algorithms.
The algorithm can discover clusters of arbitrary shape and is suitable for knowledge mining on large-scale data streams.