Research on a Parallel DBSCAN Algorithm Based on Data Partitioning and the QR*-Tree

Posted on: 2008-02-01  Degree: Master  Type: Thesis
Country: China  Candidate: H Xu  Full Text: PDF
GTID: 2178360215459807  Subject: Computer application technology
Abstract/Summary:
With the rapid development of information technology, databases have grown steadily in scale, scope, and depth of application, and large volumes of data have accumulated. Much important information is hidden in this data, so people want to analyze it further in order to make better use of it. Clustering is an important topic in data mining.

DBSCAN is a density-based spatial clustering algorithm. Under the density-based notion of a cluster, the number of objects contained in a given neighborhood of the clustering space must be no less than a certain threshold. The main advantages of DBSCAN are that it clusters quickly, handles outliers (noise) effectively, and discovers spatial clusters of arbitrary shape. However, because it operates directly on the whole database and uses a single global density parameter during clustering, it has two obvious problems. First, it requires a large amount of memory and incurs high I/O cost as the data volume grows. Second, clustering quality is poor when the density of the spatial clusters is uneven and the distances between clusters differ greatly.

To address these problems, this thesis presents a parallel DBSCAN algorithm based on data partitioning and the QR*-tree. The algorithm first partitions the whole data space into subareas according to the spatial distribution of the data. Each partition is then sent to a separate processing unit, a QR*-tree is built on each processing unit, and a QR*-tree-based DBSCAN algorithm clusters the local data. Finally, the local clustering results are merged according to a set of rules. Experimental results show that the new algorithm is effective.
Keywords/Search Tags: clustering, DBSCAN, data partitioning, QR*-tree, parallel computing
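
The partition, cluster, and merge pipeline described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the partitioning scheme, the QR*-tree construction, or the merge rules, so everything below is an assumption made for illustration: the data is split into equal-width strips along x, each widened by eps; a brute-force neighborhood search stands in for the QR*-tree; and two local clusters are joined when they share a point that is a core point in at least one partition. All function names are hypothetical.

```python
from collections import deque
from math import dist

NOISE = -1

def region_query(points, i, eps):
    # Brute-force eps-neighborhood (assumption: a spatial index such as
    # the QR*-tree would answer this query in the thesis).
    return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

def dbscan(points, eps, min_pts):
    """Plain DBSCAN; returns (labels, core_flags) for the given points."""
    labels = [None] * len(points)
    core = [False] * len(points)
    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbours = region_query(points, i, eps)
        if len(neighbours) < min_pts:
            labels[i] = NOISE          # may later become a border point
            continue
        core[i] = True
        labels[i] = cluster
        queue = deque(neighbours)
        while queue:
            j = queue.popleft()
            if labels[j] == NOISE:
                labels[j] = cluster    # border point, not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nb = region_query(points, j, eps)
            if len(nb) >= min_pts:
                core[j] = True
                queue.extend(nb)
        cluster += 1
    return labels, core

def partition_by_x(points, n_parts, eps):
    # Assumed scheme: equal-width vertical strips, widened by eps on both
    # sides so that neighbouring partitions overlap.
    xs = [p[0] for p in points]
    lo, hi = min(xs), max(xs)
    width = (hi - lo) / n_parts or 1.0
    return [[i for i, p in enumerate(points)
             if lo + k * width - eps <= p[0] <= lo + (k + 1) * width + eps]
            for k in range(n_parts)]

def parallel_dbscan(points, eps, min_pts, n_parts=2):
    parts = partition_by_x(points, n_parts, eps)
    # The per-partition runs are independent; this loop is sequential here
    # but could be handed to a process pool.
    local = [dbscan([points[i] for i in idx], eps, min_pts) for idx in parts]

    parent = {}
    def find(c):                       # union-find over (partition, label)
        parent.setdefault(c, c)
        while parent[c] != c:
            parent[c] = parent[parent[c]]
            c = parent[c]
        return c

    seen = {}                          # point index -> [(cluster id, is_core)]
    for k, (idx, (labels, core)) in enumerate(zip(parts, local)):
        for pos, i in enumerate(idx):
            if labels[pos] != NOISE:
                seen.setdefault(i, []).append(((k, labels[pos]), core[pos]))

    # Assumed merge rule: join local clusters that share a core point.
    for hits in seen.values():
        if any(is_core for _, is_core in hits):
            first = find(hits[0][0])
            for cid, _ in hits[1:]:
                parent[find(cid)] = first

    roots, result = {}, [NOISE] * len(points)
    for i, hits in seen.items():
        result[i] = roots.setdefault(find(hits[0][0]), len(roots))
    return result
```

Calling `parallel_dbscan(points, eps, min_pts, n_parts)` returns one label per input point, with `-1` marking noise. The eps-wide overlap is what gives the merge step shared points to work with; a production design would need to verify that the chosen margin and merge rule reproduce the sequential result in all cases, which this sketch does not attempt.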