A Study On Local Outliers Mining Algorithm Based On Weighted-Attribute

Posted on:2011-03-10

Degree:Master

Type:Thesis

Country:China

Candidate:L Z Ma

Full Text:PDF

GTID:2178360305965514

Subject:Computer software and theory

Abstract/Summary:

Local outliers are data objects that differ from the other data objects in their neighborhood. High-dimensional outlier mining is an important branch of data mining, and outlier detection is now widely used in credit card fraud detection, e-commerce crime detection, network intrusion analysis, and similar fields. With the development of information technology and the improved precision of data collection equipment, datasets have grown in both volume and dimensionality. Existing outlier detection algorithms handle such data poorly, because they primarily target single-dimensional or low-dimensional data objects. Meanwhile, because of the complexity and diversity of real-world conditions, a complete dataset is rarely available even when the volume of data is large. What is collected is therefore only a local dataset, and in many cases users are concerned mainly with local instability in the data.

This thesis studies methods for attribute partition, attribute reduction, and attribute weight setting according to the characteristics of the data objects' attributes, and proposes a local outlier mining algorithm based on weighted attributes. The main contributions are as follows:

1. The author proposes an attribute partition method. In a general local outlier detection algorithm, all attributes take part in the neighborhood computation, which is time-consuming. Moreover, the accuracy of the outlier measure suffers because all attributes enter the computation indiscriminately. The thesis partitions the attributes of data objects into feature attributes and environmental attributes according to their characteristics. Feature attributes determine the fundamental characteristics of a data object and are used to compare the object with its neighborhood. Environmental attributes determine the position of a data object within the dataset and are used to define its neighborhood.
2. To reduce dimensionality in a high-dimensional data space, attributes that cannot discriminate between data objects are removed.
3. The attributes of a data object differ in importance under different environments. Weighting the attributes highlights the important ones and makes the detected outliers easier to explain.
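
The three ideas in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's algorithm, whose details the abstract does not give. Every name here (`drop_non_discriminating`, `weighted_distance`, `local_outlier_scores`), the toy data, the weights, and the scoring rule (mean weighted feature distance to the k nearest neighbors) are assumptions made for illustration. In particular, the reduction step is a simple distinct-value check standing in for whatever reduction method the thesis actually uses.

```python
import math

def drop_non_discriminating(rows, columns):
    """Assumed stand-in for attribute reduction: keep only the columns
    whose values differ between at least two rows."""
    return [c for c in columns if len({row[c] for row in rows}) > 1]

def weighted_distance(a, b, columns, weights):
    """Weighted Euclidean distance restricted to the given columns."""
    return math.sqrt(sum(weights[c] * (a[c] - b[c]) ** 2 for c in columns))

def local_outlier_scores(rows, env_cols, feat_cols, weights, k):
    """Score each row by its mean weighted feature distance to the k rows
    nearest to it in the environmental attributes (hypothetical rule)."""
    scores = []
    for i, p in enumerate(rows):
        others = [q for j, q in enumerate(rows) if j != i]
        # Environmental attributes define the neighborhood.
        others.sort(key=lambda q: weighted_distance(p, q, env_cols, weights))
        neighbors = others[:k]
        # Feature attributes are compared against that neighborhood.
        total = sum(weighted_distance(p, q, feat_cols, weights) for q in neighbors)
        scores.append(total / len(neighbors))
    return scores

# Toy data (hypothetical): x, y give the position; temp is the feature;
# unit is constant, so it cannot discriminate between objects.
rows = [
    {"x": 0, "y": 0, "temp": 20.0, "unit": 1},
    {"x": 1, "y": 0, "temp": 21.0, "unit": 1},
    {"x": 0, "y": 1, "temp": 19.0, "unit": 1},
    {"x": 1, "y": 1, "temp": 35.0, "unit": 1},
    {"x": 10, "y": 10, "temp": 35.0, "unit": 1},
    {"x": 11, "y": 10, "temp": 36.0, "unit": 1},
    {"x": 10, "y": 11, "temp": 34.0, "unit": 1},
]

kept = drop_non_discriminating(rows, ["x", "y", "temp", "unit"])
env_cols = [c for c in kept if c in ("x", "y")]    # assumed partition
feat_cols = [c for c in kept if c == "temp"]       # assumed partition
weights = {"x": 1.0, "y": 1.0, "temp": 1.0}        # assumed weights

scores = local_outlier_scores(rows, env_cols, feat_cols, weights, k=2)
most_outlying = max(range(len(rows)), key=lambda i: scores[i])
print(kept, most_outlying, scores[most_outlying])
```

In this toy data the row at position (1, 1) has a `temp` of 35.0, which is ordinary for the cluster near (10, 10) but unusual among its own neighbors. That is the sense in which the outlier is local rather than global.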

Keywords/Search Tags:

Data mining, Attribute partition, Rough set, Attribute reduction

Related items

1	Data Mining Research Of Vehicle Sales Based On Hash Quick Attribute Reduction Algorithm
2	Research Of Attribute Reduction Algorithms In Information System
3	Research On Heuristic Attribute Reduction Algorithm Based On Rough Set
4	Research Of Higher Vocational Student's Non-intellectual Factors Based On Rough Set
5	Based On Rough Set Theory Data Mining Technology And Its Application Of Potential Consumers Of Private Cars
6	Association Rule Mining Algorithm Based On Rough Set
7	Rough Set Data Mining Approach And Its Application Relative To Decision Problem
8	Based On Rough Set Attribute Reduction Algorithm Of Data Mining To Improve Research
9	Research On The Attribute Reduction Algorithm Based On Rough Set In Data Mining
10	Algorithm And Implementation Of Data Mining Based On Rough Set