The discovery of multiple-level profile association rules

Posted on: 2003-01-16
Degree: Ph.D.
Type: Dissertation
University: The University of Mississippi
Candidate: Bland, Charles Earl
Full Text: PDF
GTID: 1468390011481912
Subject: Computer Science
Abstract/Summary:
Knowledge discovery in databases (KDD), or data mining, is the field of study concerned with developing methods capable of efficiently analyzing very large datasets. Our research focuses on an area of data mining known as association discovery. A commonly used method for identifying associations is association rules. An association rule is a rule of the form A → B, where A and B are sets of items. This rule implies that when A occurs in a dataset, B will also occur with a certain probability. Extensions of the traditional association rule problem present different views of data, giving insight that may not be possible with traditional association rules.

Techniques for discovering association rules have traditionally focused on identifying relationships between items describing some aspect of human behavior, usually buying behavior, to determine which items customers purchase together. More than ever before, organizations are collecting personal information (profile information) associated with customer behavior. Considering this trend, in this study we take on the problem of incorporating profile information into association rule discovery. In addition, we study this problem at multiple levels.

In generating multiple-level profile association rules, we were faced with two major problems: (1) representing knowledge at multiple levels and (2) mining dense datasets, which result from the inclusion of profile items. We addressed the first problem by using the markup language XML to partition data hierarchically according to a user-specified categorization of dataset items. We addressed the second problem by introducing a new method for compressing transactional data using a bit representation. Our compression technique allowed fairly large datasets to fit into memory. This eliminated the need for multiple dataset scans when discovering association rules, resulting in faster processing time. We tested our design on two real-world datasets. Our design resulted in a significant reduction of dataset size and faster generation of association rules. We also demonstrated that multiple-level profile association rules are a useful way of understanding data.
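
The abstract does not describe the dissertation's bit representation in detail, so the following is only a minimal sketch of one generic way such an encoding can work, not the author's method. It assumes a "vertical" layout in which each item is stored as an integer bitmask with bit t set when transaction t contains the item. The function names (encode, support_count, confidence) and the sample profile and purchase items are hypothetical. Once the dataset is encoded, the support of an itemset and the confidence of a rule A → B can be computed with bitwise AND operations on the in-memory masks, without rescanning the original transactions.

```python
def encode(transactions):
    """Build one bitmask per item: bit t is set if transaction t contains the item."""
    columns = {}
    for t, items in enumerate(transactions):
        for item in items:
            columns[item] = columns.get(item, 0) | (1 << t)
    return columns


def support_count(columns, itemset):
    """Count the transactions that contain every item in a non-empty itemset."""
    items = list(itemset)
    if not items:
        raise ValueError("itemset must not be empty")
    mask = columns.get(items[0], 0)
    for item in items[1:]:
        mask &= columns.get(item, 0)  # keep only transactions containing all items so far
    return bin(mask).count("1")


def confidence(columns, antecedent, consequent):
    """Estimate the confidence of the rule antecedent -> consequent."""
    base = support_count(columns, antecedent)
    if base == 0:
        return 0.0
    both = support_count(columns, set(antecedent) | set(consequent))
    return both / base


# Hypothetical transactions mixing a profile item (age group) with purchased items.
transactions = [
    {"age:20-29", "bread", "milk"},
    {"age:20-29", "bread"},
    {"age:30-39", "milk"},
    {"age:20-29", "bread", "milk"},
]
columns = encode(transactions)
print(support_count(columns, {"bread", "milk"}))
print(confidence(columns, {"age:20-29"}, {"bread"}))
```

In this sketch a profile item such as "age:20-29" is treated like any other item, which is also why datasets that include profile items become dense: every transaction carries a value for each profile attribute.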
Keywords/Search Tags: Association rules, Discovery, Data