
Research On Outlier Mining Algorithms Based On Subspace And Its Application

Posted on: 2009-05-07
Degree: Master
Type: Thesis
Country: China
Candidate: L Y Ge
Full Text: PDF
GTID: 2178360248954315
Subject: Computer software and theory
Abstract/Summary:
Outlier mining is one of the most important topics in data mining. It helps people discover true and unexpected information, and it has attracted the interest of many researchers. Most traditional outlier mining methods examine the data in the full-dimensional space, so they have difficulty finding deviating data, or outliers, that are visible only in a subspace. This thesis studies outlier mining in subspaces by partitioning the high-dimensional space into low-dimensional subspaces. The main work is as follows:
(1) An outlier mining algorithm (OM-PSO) based on PSO (Particle Swarm Optimization) and subspaces is presented. The algorithm treats candidate outlier subspaces as the particles of a swarm and searches for outlier subspaces with a PSO algorithm that includes mutation, using the sparsity coefficient of a subspace as the search criterion. Data falling in an outlier subspace are regarded as outliers. Experiments on star spectra data from the LAMOST project validate the OM-PSO algorithm.
(2) A local outlier mining algorithm based on subspace partitioning (PSO-SPLOF) is presented. First, the data set is divided into disjoint subspaces. The quality of a partition is measured by its skew of partition, and PSO is used to search for the best partition. Second, the local outlier factor (SPLOF) of each data object in the best partition is computed, and local outliers are identified by their SPLOF values. Experiments on spectral data show that the PSO-SPLOF algorithm does not depend on user-supplied parameters and is both scalable and efficient.
(3) On this basis, an outlier mining system based on subspaces is designed and implemented, using VC++ 6.0 and Oracle 9i as development tools. Its function modules and key technologies are elaborated.
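The abstract does not define the sparsity coefficient used to score subspaces in (1). As a rough illustration only, the sketch below assumes the definition commonly used in subspace outlier detection: each dimension is cut into phi equi-depth ranges, a k-dimensional cell is expected to hold N * (1/phi)^k of the N points, and the coefficient is the standardized deviation of the actual count from that expectation. The function names, the binning scheme, and the parameter phi are hypothetical and are not taken from the thesis.

```python
import math
from collections import Counter


def equi_depth_bins(values, phi):
    """Assign each value a bin index in 0..phi-1 so bins hold roughly equal counts."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    bins = [0] * n
    for rank, i in enumerate(order):
        bins[i] = rank * phi // n
    return bins


def sparsity_coefficients(data, dims, phi):
    """Return {cell: coefficient} for every occupied cell of the subspace `dims`.

    Assumed definition (not stated in the abstract):
        S = (count - N * f^k) / sqrt(N * f^k * (1 - f^k)),  f = 1 / phi
    Strongly negative values mark cells that are sparser than expected.
    """
    n = len(data)
    k = len(dims)
    # Discretize each selected dimension independently.
    binned = [equi_depth_bins([row[d] for row in data], phi) for d in dims]
    # A cell is the tuple of bin indices of one point across the selected dimensions.
    counts = Counter(zip(*binned))
    f_k = (1.0 / phi) ** k
    expected = n * f_k
    std = math.sqrt(n * f_k * (1.0 - f_k))
    return {cell: (count - expected) / std for cell, count in counts.items()}
```

Under this assumption, a PSO-style search would use the most negative coefficient found in a candidate subspace as that particle's fitness; the swarm update and mutation steps themselves are not described in the abstract and are omitted here.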
Keywords/Search Tags: Outlier, Particle Swarm Optimization, Sparsity coefficient, Skew of partition, Subspace, Star spectra