Discovery of indirect association and its applications

Posted on:2003-01-09

Degree:Ph.D

Type:Thesis

University:University of Minnesota

Candidate:Tan, Pang-Ning

GTID:2468390011980518

Subject:Computer Science

Abstract/Summary:

Data mining has become an essential data analysis tool because it provides an automated procedure for the rapid discovery of novel but implicit knowledge in large databases. One of the main techniques in data mining is association pattern discovery, which attempts to find items that occur together relatively frequently in the data. This technique has been successfully applied to various application domains, including business decision support, telecommunication alarm diagnosis, and molecular genomics.

Because current association pattern discovery algorithms focus on finding frequent patterns, they fail to capture other forms of interesting multivariate relationships, such as negative associations, which are equally valuable in many application domains. For instance, negative associations characterize the dependence relationships between competing products such as Huggies and Pampers, or the opposite outcomes of related events in an event sequence database, such as FIRE_ALARM=ON but FIRE_SPRINKLER=OFF. Mining negative associations is computationally expensive, especially for sparse transaction data, where a large percentage of the extracted patterns have low interest values.

This thesis introduces a new type of pattern called indirect association, which provides an effective way to discover interesting negative associations by extracting only “infrequent patterns that are expected to be frequent.” An efficient, level-wise algorithm for mining indirect associations is presented to address the computational issue. The second part of this thesis extends the concept of indirect association to sequential data. Sequential indirect association has been successfully applied to Web usage data to discover groups of Web users who share similar browsing behavior.

Finally, every association pattern discovery task requires a metric to evaluate the interestingness of the discovered patterns. While many such metrics have been proposed in the data mining literature, the metric that is most consistent with the expectations of domain experts is rarely known. This dissertation provides an in-depth study of how to select the most appropriate metric for a given application. The results of this study will have an impact on association pattern discovery and on all other data mining tasks that require an objective measure for preprocessing, for post-processing, or within the mining algorithm itself.
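The idea of “infrequent patterns that are expected to be frequent” can be illustrated with a small Python sketch. It is not the level-wise algorithm presented in the thesis, only a brute-force illustration under stated assumptions: a pair of items (a, b) is reported when the pair itself rarely co-occurs, yet both items occur frequently with, and depend strongly on, a shared "mediator" itemset. The function names, the thresholds, the cosine-style dependence measure, and the toy transactions (including the item "wipes") are all hypothetical choices made for this example.

```python
from itertools import combinations


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def dependence(x, y, transactions):
    """Cosine-style dependence: sup(X u Y) / sqrt(sup(X) * sup(Y)).

    Assumption: any other objective interest measure could be used here.
    """
    sx, sy = support(x, transactions), support(y, transactions)
    if sx == 0 or sy == 0:
        return 0.0
    return support(set(x) | set(y), transactions) / (sx * sy) ** 0.5


def indirect_associations(transactions, pair_max_sup, mediator_min_sup,
                          min_dep, max_mediator_size=2):
    """Brute-force search for (a, b, mediator) triples.

    A pair (a, b) qualifies when:
      1. sup({a, b}) is below `pair_max_sup` (the pair is infrequent), and
      2. some mediator set Y makes both {a} u Y and {b} u Y frequent
         (support >= `mediator_min_sup`) and strongly dependent
         (dependence >= `min_dep`).
    """
    transactions = [frozenset(t) for t in transactions]
    items = sorted(set().union(*transactions))
    found = []
    for a, b in combinations(items, 2):
        if support({a, b}, transactions) >= pair_max_sup:
            continue  # the pair co-occurs too often to be "indirect"
        others = [i for i in items if i not in (a, b)]
        for k in range(1, max_mediator_size + 1):
            for y in combinations(others, k):
                if all(
                    support({x, *y}, transactions) >= mediator_min_sup
                    and dependence({x}, y, transactions) >= min_dep
                    for x in (a, b)
                ):
                    found.append((a, b, frozenset(y)))
    return found


# Toy market-basket data (hypothetical): the two competing brands never
# appear together, but each is often bought with the same third item.
baskets = [
    {"huggies", "wipes"}, {"huggies", "wipes"}, {"huggies", "wipes"},
    {"pampers", "wipes"}, {"pampers", "wipes"}, {"pampers", "wipes"},
    {"milk"}, {"milk", "bread"},
]
result = indirect_associations(
    baskets, pair_max_sup=0.1, mediator_min_sup=0.3, min_dep=0.5
)
print(result)
```

On this toy data the only reported pattern is the pair of competing brands with "wipes" as the mediator. The exhaustive enumeration of mediator sets grows quickly with the number of items, which is the kind of cost an efficient level-wise algorithm is meant to avoid.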

Keywords/Search Tags:

Discovery, Data, Mining, Indirect association, Application

Related items

1	Semantic Association Discovery And Its Application From Linked Open Data
2	Research And Application Of Association Rule In Data Mining
3	Algorithm Optimization Research And Application Of Association Rule In Data Mining
4	The Application And Research Of Data Mining Technology In P2P's Discovery Mechanism
5	Research The Application Of Data Mining On Military Cadres' Culturing
6	Data Mining Techniques And Algorithms For Mining Association Rules
7	Web Data Mining Based On The Association Rules Discovery
8	The Research & Implement For Mining Association Rules Of Definite Semanteme
9	Design And Implementation Of Target Association Mining System In Mail Communication Network
10	Study On Parallel For Association Rules Mining