Distributed Grid Environment Outlier Mining Method

Posted on: 2011-11-07
Degree: Master
Type: Thesis
Country: China
Candidate: B Han
Full Text: PDF
GTID: 2208360308971880
Subject: Computer application technology
Abstract/Summary:
Outlier mining, one of the most important research topics in data mining, discovers objects whose behavior is inconsistent with that of the other data in a data set. It is applied in many fields, such as network intrusion detection, credit card fraud detection, and weather forecasting. However, with the emergence of distributed, heterogeneous, massive data sets, centralized methods can no longer meet the requirements of practical applications. This thesis studies outlier mining algorithms for horizontally and vertically partitioned data sets, using the grid as the distributed computing platform. The main research work is as follows:

(1) A grid-based distributed outlier mining algorithm is presented for vertically partitioned data sets. First, following the local outlier factor (LOF) approach, the local distances between data objects are computed on every sub-node. Second, the MinPts-distance neighborhoods of the data objects are generated through a grid service, which completes the k-nearest-neighbor (KNN) search. Third, the neighborhoods are transmitted to the main node, where the LOF values are computed. Finally, experimental results on the KDD CUP99 data set validate the effectiveness of the algorithm.

(2) A distributed algorithm based on particle swarm optimization (PSO) and subspaces is presented for horizontally partitioned data sets. First, local sparsity subspaces are computed on each node using OM-PSO and transmitted to the main node. On the main node, identical local sparsity subspaces are merged, and the global subspaces are generated through a grid service. Finally, the global subspaces are judged by computing their fitness values, and the outliers are determined. Experimental results on stellar spectra data from the LAMOST project validate the effectiveness of the algorithm.
Keywords/Search Tags: Outlier, Grid, LOF, KNN, OM-PSO, Sparsity subspace
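
The vertically partitioned LOF scheme in item (1) can be illustrated with a small single-process sketch. It is not the thesis implementation: the abstract does not name the distance measure or the grid service interface, so the sketch assumes Euclidean distance (whose squared value adds up across disjoint attribute blocks), simulates sub-nodes and the main node as plain functions, and does not handle duplicate points. The names `local_sq_dists`, `combine`, and `lof_scores` are hypothetical.

```python
import math

def local_sq_dists(block):
    """Sub-node step (assumed): pairwise squared distances over this node's attributes."""
    n = len(block)
    return [[sum((a - b) ** 2 for a, b in zip(block[i], block[j]))
             for j in range(n)] for i in range(n)]

def combine(partials):
    """Main-node step (assumed): squared Euclidean distances from disjoint
    attribute blocks sum to the squared distance over all attributes."""
    n = len(partials[0])
    return [[math.sqrt(sum(p[i][j] for p in partials))
             for j in range(n)] for i in range(n)]

def lof_scores(dist, min_pts):
    """Standard LOF from a full distance matrix (assumes no duplicate points)."""
    n = len(dist)
    k_dist, neigh = [], []
    for i in range(n):
        others = sorted((dist[i][j], j) for j in range(n) if j != i)
        kd = others[min_pts - 1][0]          # MinPts-distance of object i
        k_dist.append(kd)
        # MinPts-distance neighborhood, including ties at distance kd
        neigh.append([j for d, j in others if d <= kd])
    # Local reachability density: inverse of the mean reachability distance
    lrd = []
    for i in range(n):
        reach = sum(max(k_dist[j], dist[i][j]) for j in neigh[i])
        lrd.append(len(neigh[i]) / reach)
    # LOF: mean ratio of the neighbors' density to the object's own density
    return [sum(lrd[j] for j in neigh[i]) / (len(neigh[i]) * lrd[i])
            for i in range(n)]

# Toy data: four clustered objects and one distant object, four attributes.
rows = [
    [0.0, 0.0, 0.0, 0.0],
    [1.0, 0.0, 0.0, 1.0],
    [0.0, 1.0, 1.0, 0.0],
    [1.0, 1.0, 1.0, 1.0],
    [8.0, 8.0, 8.0, 8.0],
]
# Vertical partition: each simulated sub-node holds two attribute columns.
block_a = [r[:2] for r in rows]
block_b = [r[2:] for r in rows]

dist = combine([local_sq_dists(block_a), local_sq_dists(block_b)])
scores = lof_scores(dist, min_pts=2)
# The object with the largest LOF score is the strongest outlier candidate.
top = max(range(len(scores)), key=scores.__getitem__)
```

In this toy run the clustered objects score about 1 and the distant object scores far higher, which is the behavior LOF is designed to show. A real deployment would exchange only the partial distance or neighborhood information between nodes rather than the raw attribute blocks.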