Research On Outlier Mining And Intensional Knowledge Discovery

Posted on:2009-09-05

Degree:Master

Type:Thesis

Country:China

Candidate:F N Lian

Full Text:PDF

GTID:2178360272490105

Subject:Computer software and theory

Abstract/Summary:

Data are among the most valuable resources in today's information society. A great deal of useful knowledge is hidden in complex datasets, and discovering and using that knowledge has become a precondition for scientific decision-making. Data mining extracts hidden, previously unknown, and potentially useful information and knowledge from large, incomplete, and noisy datasets by means of association rule mining, clustering, and classification.

Outlier mining is one of the important techniques in data mining. Outliers are observations that lie at an abnormal distance from the others and do not conform to common patterns or behavior; they are often suspected of having been generated by a different mechanism. Outliers are not necessarily erroneous data: some may carry important information, such as evidence of fraudulent behavior, intrusion activity, or unusual consumption behavior. Studying outliers is therefore worthwhile.

Outlier mining can be broken into three questions: ① What kind of observation is considered an outlier? ② How can outliers be found effectively? ③ Why is an outlier exceptional? The answer to the third question is what we call intensional knowledge. At present, most outlier mining algorithms focus only on identifying outliers and do not explain why an outlier is considered exceptional, although that explanation matters to users and to the purpose of outlier mining.

This thesis proposes an association space-based outlier mining algorithm. It finds the smallest attribute set that causes an observation to be exceptional and reports that set as the observation's intensional knowledge: it is these attributes that make the observation an outlier. Specifically, the research covers the following aspects:

① Several key concepts and techniques of data mining are studied, including the applications and classification of data mining, data preprocessing, clustering, and association rules.

② The strengths and weaknesses of the k-means algorithm are discussed, several initialization methods are studied, and a novel initialization method is proposed.

③ The theories and methods of distance-based outlier mining are analyzed and summarized comprehensively. A sum-of-k-nearest-neighbors-based outlier mining algorithm is designed, and a partition-based algorithm is introduced.

④ The FindNonTrivialOuts algorithm is investigated, and an association space-based outlier mining algorithm is proposed and verified through experiments.
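
As an illustration of the distance-based idea mentioned in the abstract, the following is a minimal Python sketch of a generic sum-of-k-nearest-neighbor-distances outlier score. It is not taken from the thesis, and the thesis's own algorithm may differ in its details. The function names `knn_distance_sums` and `top_outliers`, the use of Euclidean distance, and the brute-force neighbor search are all assumptions made for this example.

```python
import math


def knn_distance_sums(points, k):
    """For each point, sum the Euclidean distances to its k nearest neighbors.

    Brute-force O(n^2) search; an assumption made for clarity, not efficiency.
    """
    scores = []
    for i, p in enumerate(points):
        # Distances from p to every other point, smallest first.
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(sum(dists[:k]))
    return scores


def top_outliers(points, k, n):
    """Return the indices of the n points with the largest k-NN distance sums."""
    scores = knn_distance_sums(points, k)
    ranked = sorted(range(len(points)), key=lambda i: scores[i], reverse=True)
    return ranked[:n]


if __name__ == "__main__":
    # Four points in a tight cluster and one point far away (hypothetical data).
    data = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10)]
    print(top_outliers(data, k=2, n=1))
```

A larger score means a point is far from even its closest neighbors, so ranking by this score surfaces isolated observations. Note that such a score only identifies outliers; it does not by itself say which attributes make them exceptional.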

Keywords/Search Tags:

Outlier Mining, Intensional Knowledge, Association Space

Related items

1	Study On An Analysis Method For Cluster-based Outlier
2	Research On Outlier Mining Based On Association Rules
3	Research On Association Rules Mining And Outlier Analysis And Their Application In Analytical Customer Relationship Management
4	Outlier Mining Method Based On Gini Indexes And Sub-space Research
5	Research On Outlier Detection And Searching Algorithm For Outlying Paraphrase Space
6	Research Of Outlier Mining Algorithms Based On Space Partitioning In High-dimension
7	Research On Outlier Data Mining In High Dimensional Space
8	Study On Space Partitioning-based Optimized Clustering Algorithms And Related Techniques
9	Based On Information Entropy And The Subspace Outlier Mining Algorithm
10	Research And Application Of Outlier Mining And Finding Intentional Knowledge