A Study On Local Outliers Mining Algorithm Based On Weighted-Attribute

Posted on: 2011-03-10
Degree: Master
Type: Thesis
Country: China
Candidate: L Z Ma
GTID: 2178360305965514
Subject: Computer software and theory
Abstract/Summary:
Local outliers are data objects that differ from the other data objects within their neighborhood. High-dimensional outlier mining is an important branch of data mining, and outlier detection is now widely used in credit card fraud detection, detection of criminal behavior in e-commerce, network intrusion analysis, and similar fields.

With the development of information technology and the improved precision of data collection equipment, data sets have grown larger in volume and higher in dimensionality. Existing outlier detection algorithms handle such data poorly, because they primarily target single-dimensional or low-dimensional data objects. Meanwhile, because of the complexity and diversity of the real world, a complete data set is rarely available even when the data volume is large. What is obtained is therefore only a local data set, and in many cases users are concerned mainly with local data instability.

This thesis studies methods for attribute partition, attribute reduction, and attribute weight setting according to the characteristics of the data objects' attributes, and proposes a local outlier mining algorithm based on weighted attributes. The main contributions are as follows:

1. The author gives an attribute partition method. In general local outlier detection algorithms, all attributes are involved in the neighborhood computation, which takes considerable time. Moreover, the accuracy of the outlier measurement suffers because all attributes enter the computation indiscriminately. The thesis partitions the attributes of data objects into feature attributes and environmental attributes according to their characteristics. Feature attributes determine the fundamental characteristics of a data object and are used to compare the data object with its neighborhood. Environmental attributes determine the position of a data object within the data set and are used to define the neighborhood of the data object.

2. To reduce dimensionality, attributes that cannot discriminate between data objects are removed from the high-dimensional data space.

3. The attributes of data objects differ in importance under different environments. Weighting the attributes highlights the important ones and makes the detected outliers easier to explain.
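
The three ideas above can be illustrated with a minimal Python sketch. It is not the algorithm from the thesis, whose details the abstract does not give. The function names, the toy data, the "drop constant attributes" reduction rule (a stand-in for the rough-set reduction suggested by the keywords), and the "weighted deviation from the neighborhood mean" score are all assumptions made for illustration.

```python
import math

def discriminating_attributes(objects, attr_idx):
    # Assumed reduction rule: keep an attribute only if it takes more than
    # one distinct value, i.e. it can tell at least two objects apart.
    return [a for a in attr_idx if len({obj[a] for obj in objects}) > 1]

def weighted_local_outlier_scores(objects, env_idx, feat_idx, weights, k):
    # Assumed score: weighted absolute deviation of each feature attribute
    # from its mean over the k nearest neighbours, where "nearest" is
    # measured on the environmental attributes only.
    scores = []
    for i, obj in enumerate(objects):
        env_i = [obj[a] for a in env_idx]
        neighbours = sorted(
            (j for j in range(len(objects)) if j != i),
            key=lambda j: math.dist(env_i, [objects[j][a] for a in env_idx]),
        )[:k]
        score = 0.0
        for a, w in zip(feat_idx, weights):
            mean = sum(objects[j][a] for j in neighbours) / len(neighbours)
            score += w * abs(obj[a] - mean)
        scores.append(score)
    return scores

# Hypothetical data: columns 0-1 are environmental attributes (position),
# columns 2-3 are feature attributes; column 3 is constant.
objects = [
    (0.0, 0.0, 10.0, 1.0),
    (0.1, 0.0, 11.0, 1.0),
    (0.0, 0.1, 10.5, 1.0),
    (0.1, 0.1, 30.0, 1.0),  # ordinary globally, unusual among its neighbours
    (5.0, 5.0, 30.0, 1.0),
    (5.1, 5.0, 31.0, 1.0),
    (5.0, 5.1, 29.0, 1.0),
]
env = [0, 1]
feat = discriminating_attributes(objects, [2, 3])  # constant column 3 is dropped
weights = [1.0] * len(feat)                        # assumed uniform weights
scores = weighted_local_outlier_scores(objects, env, feat, weights, k=3)
most_outlying = scores.index(max(scores))
```

In this toy data the fourth object has a feature value of 30.0, which is common in the second cluster, but its three nearest neighbours by position have values near 10.5. It therefore receives the highest score, which is the sense in which such an outlier is local rather than global.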
Keywords/Search Tags: Data mining, Attribute partition, Rough set, Attribute reduction