
Research On Stream Clustering Evolutionary Algorithm Based On Kernel Method

Posted on: 2019-05-21
Degree: Master
Type: Thesis
Country: China
Candidate: M R Sun
Full Text: PDF
GTID: 2428330572452534
Subject: Software engineering
Abstract/Summary:
Stream data analysis is a key topic in data mining. Its techniques include classification, frequent item mining, clustering, estimation, prediction, relevance grouping, and association rules. CluStream, one of the classic stream clustering algorithms, uses an online-offline processing framework: the online phase analyzes stream data as it arrives, and the offline phase clusters on user request to output results for the corresponding time period. However, the algorithm is inefficient at partitioning high-dimensional data. To address this, a kernel-based stream clustering evolution algorithm is proposed. First, because stream data can have many dimensions and is often not linearly separable in the original space, the kernel method is introduced: a nonlinear mapping projects the data into a high-dimensional feature space, represented by a kernel matrix, so that the data can be separated linearly and the curse of dimensionality in data processing is avoided. However, because stream data is unbounded, constructing the kernel matrix requires a large amount of computation and memory. This thesis therefore uses a differential sampling method based on statistical leverage scores to obtain an online sample set whose distribution is similar to that of the entire stream, and builds a sample kernel matrix from it. This relieves memory pressure and reduces the time complexity of the algorithm while preserving real-time processing performance. Then, by repeated (cyclic) clustering of the data in the sample kernel matrix, a summary data model of the stream is obtained and used to assign new data. Finally, to track the evolution of the stream, the algorithm draws on the time fading factor introduced in earlier stream clustering algorithms: a fading mechanism built on a time fading function updates and assigns data in real time and reflects the evolution of the stream. Experiments on random and UCI data sets show that the proposed algorithm has good clustering performance and that this performance is more stable across data sets of different dimensionality.
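
The abstract does not give formulas, so the following is only a minimal sketch of how a kernel-based assignment step and a time fading function could fit together. Everything specific in it is an assumption and is not taken from the thesis: the Gaussian (RBF) kernel and its `gamma` parameter, the exponential fading form 2^(-decay * age) (the form used in DenStream), and the hypothetical helper names `fading_weight`, `feature_space_sq_distance`, and `assign`. The leverage-score sampling step is not shown.

```python
import math


def rbf_kernel(x, y, gamma=0.5):
    """Gaussian (RBF) kernel. The kernel choice and gamma are assumptions."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def fading_weight(age, decay=0.1):
    """Exponential time fading: the weight halves every 1/decay time units.

    Assumed form; the abstract only says a time fading function is used.
    """
    return 2.0 ** (-decay * age)


def feature_space_sq_distance(x, members, weights, kernel=rbf_kernel):
    """Squared distance from phi(x) to the weighted mean of phi(members).

    Uses kernel evaluations only, so the feature map phi is never
    computed explicitly (the kernel trick).
    """
    total = sum(weights)
    cross = sum(w * kernel(x, m) for w, m in zip(weights, members)) / total
    within = sum(
        wi * wj * kernel(mi, mj)
        for wi, mi in zip(weights, members)
        for wj, mj in zip(weights, members)
    ) / (total * total)
    return kernel(x, x) - 2.0 * cross + within


def assign(x, now, clusters):
    """Return the index of the cluster nearest to x in feature space.

    Each cluster is a list of (point, timestamp) pairs; older points
    contribute less because their weights have faded.
    """
    best_index, best_dist = None, float("inf")
    for index, cluster in enumerate(clusters):
        members = [point for point, _ in cluster]
        weights = [fading_weight(now - stamp) for _, stamp in cluster]
        dist = feature_space_sq_distance(x, members, weights)
        if dist < best_dist:
            best_index, best_dist = index, dist
    return best_index


if __name__ == "__main__":
    clusters = [
        [((0.0, 0.0), 0), ((0.1, 0.0), 1)],
        [((5.0, 5.0), 0), ((5.1, 5.0), 1)],
    ]
    print(assign((0.05, 0.1), now=2, clusters=clusters))
    print(assign((4.9, 5.2), now=2, clusters=clusters))
```

In this sketch the stored sample points play the role of the summary model: a new point is compared against each cluster using kernel values only, and the fading weights let recent samples dominate the cluster centre so that the assignment follows the evolution of the stream.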
Keywords/Search Tags: stream data, clustering, kernel method, statistical leverage score, time fading function