
Research On Technology For Detecting Density-based Outlier

Posted on: 2008-05-08
Degree: Master
Type: Thesis
Country: China
Candidate: N Li
Full Text: PDF
GTID: 2178360272469824
Subject: Computer software and theory
Abstract/Summary:
Performing data mining and business intelligence on integrated data is the ultimate goal of data integration and the most effective way to increase the value of the data. Data mining and data warehouse techniques have developed rapidly over the last ten years, with new tools reaching the market almost every month. At the same time, the emergence of electronic commerce, the urgent need for network security, and the high frequency of network intrusion and of fraud in finance and communications have drawn attention to a new task in data mining: outlier analysis.

Outlier detection and analysis is an important data mining task, and the study of outlier detection methods is of great practical significance. Methods that identify outliers based on density have notable advantages. However, the index structures proposed for these methods perform poorly on k nearest neighbors search, which leads to unsatisfactory detection performance on medium- or high-dimensional data. Based on an analysis of density-based outlier detection technology, a new index structure is introduced to improve the performance of k nearest neighbors queries, and on this basis a density-based outlier detection system for medium- or high-dimensional data is constructed.

The following work has been done. To optimize k nearest neighbors search, a new index structure is introduced according to the application background. The new structure combines the merits of two important existing types of index structure, X-Tree and VA-File, to improve the performance of k nearest neighbors search, and the corresponding algorithm is given. For outlier detection, a density-based outlier detection system, VAXLOF, is designed and constructed to detect outliers in medium- or high-dimensional data.

Analysis of the experimental results shows that the performance of k nearest neighbors search has been improved; as a result, the VAXLOF system achieves better performance and detects outliers effectively.
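To illustrate why k nearest neighbors search dominates the cost of density-based detection, the following is a minimal sketch of a local outlier factor (LOF) style score. It is not the VAXLOF system: the abstract does not give its algorithm, and the use of LOF here is an assumption suggested only by the system's name. The function names knn and lof_scores are hypothetical, the neighbor search is a brute-force scan standing in for the X-Tree/VA-File index, and ties at the k-distance are ignored for simplicity.

```python
import math


def knn(points, i, k):
    """Return the k nearest neighbors of points[i] as (distance, index) pairs.

    Brute-force scan over all points: this is the step an index structure
    would be expected to speed up (assumption, not the thesis's algorithm).
    """
    dists = sorted(
        (math.dist(points[i], q), j) for j, q in enumerate(points) if j != i
    )
    return dists[:k]


def lof_scores(points, k):
    """Compute a LOF-style score per point; values well above 1 suggest outliers."""
    n = len(points)
    neighbors = [knn(points, i, k) for i in range(n)]
    # k-distance: distance from each point to its k-th nearest neighbor.
    k_dist = [nb[-1][0] for nb in neighbors]

    # Local reachability density: inverse of the mean reachability distance,
    # where reach-dist(p, o) = max(k-distance(o), d(p, o)).
    lrd = []
    for i in range(n):
        reach = sum(max(k_dist[j], d) for d, j in neighbors[i])
        lrd.append(len(neighbors[i]) / reach if reach > 0 else math.inf)

    # Score: mean ratio of the neighbors' densities to the point's own density.
    scores = []
    for i in range(n):
        if math.isinf(lrd[i]):
            # Point sits on top of duplicates: treat it as an inlier.
            scores.append(1.0)
            continue
        ratios = [lrd[j] / lrd[i] for _, j in neighbors[i]]
        scores.append(sum(ratios) / len(ratios))
    return scores
```

Each point needs one k nearest neighbors query, so with a linear scan the whole computation is quadratic in the number of points. This is the cost that a faster index for medium- or high-dimensional data is meant to reduce.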
Keywords/Search Tags:Data mining, Density-based outlier detection, Outlier, K nearest neighbors search