
Research On Outlier Mining Based On Association Rules

Posted on: 2008-11-02   Degree: Master   Type: Thesis
Country: China   Candidate: L L Zhang   Full Text: PDF
GTID: 2178360215496501   Subject: Computer software and theory
Abstract/Summary:
The volume of data that people have acquired increases exponentially with improvements in data collection and storage technology. The information hidden behind these data, which describes the general characteristics of the data and predicts future development trends, can serve as a reference in decision-making. Data mining, which is applied to the statistics, evaluation, synthesis and reasoning of data, can be used to dig out this hidden information.

Outlier data mining, which is widely used in daily life, is a new field of data mining. It helps to find true but unexpected information. The technology of outlier data mining has attracted a great deal of attention from the database, machine learning and statistics research communities.

Association rule mining, which discovers previously unknown and interesting relationships among attributes in large databases, is another rising field of data mining. Many scholars in the database, artificial intelligence and statistics research communities are drawn to the study of association rules, and many achievements have been made.

Most traditional algorithms used for association rule mining are based on the classical Apriori algorithm. The Apriori algorithm performs a bottom-up, breadth-first search to enumerate every single frequent itemset, which is a time-consuming process. These traditional algorithms therefore do not perform satisfactorily on dense databases.

In this dissertation, an improved algorithm, also based on the Apriori algorithm, is proposed. The essential improvements include: (a) introducing interest to remove trivial rules; (b) calculating the power set of the frequent 1-itemsets and using linked lists to store transaction identifiers in the data structure.
These improvements reduce the number of database scans to one, which improves the speed and efficiency of data mining.

Association rule mining aims at finding the itemsets that satisfy the minimum support and the minimum confidence. As a supplement, outlier mining aims at finding the itemsets that satisfy the maximum support. In this dissertation, association rules are incorporated into outlier data mining. The new hybrid algorithm shows satisfactory performance in benchmarking.

An outlier data mining system can act as a bridge between outlier data mining theory and application, and in recent years more and more research has focused on its design and implementation. An outlier data mining architecture, based on the improved data mining algorithm, is also presented. The schemes for implementing this architecture remain to be discussed in detail in future work.
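
The abstract does not give the algorithm itself, so the following Python sketch is only one illustrative reading of the ideas it names, not the dissertation's implementation. It scans the transactions once to build a transaction-identifier list per item, enumerates the power set of the frequent 1-itemsets, and sorts itemsets by a minimum and a maximum support threshold. Several points are assumptions made for illustration: Python sets stand in for the linked lists mentioned above, "maximum support" is read as an upper bound on support, "interest" is taken to be the common lift measure, and all function names and the sample data are hypothetical.

```python
from itertools import combinations


def build_tid_lists(transactions):
    """Scan the transactions once; map each item to the set of IDs of transactions containing it."""
    tid_lists = {}
    for tid, items in enumerate(transactions):
        for item in items:
            tid_lists.setdefault(item, set()).add(tid)
    return tid_lists


def tids_of(itemset, tid_lists):
    """Transactions containing every item of a non-empty itemset (intersection of TID sets)."""
    return set.intersection(*(tid_lists[item] for item in itemset))


def mine_itemsets(transactions, min_support, max_support):
    """Return (frequent, rare) dicts mapping itemset tuples to their support.

    Assumption: 'rare' itemsets are those that occur at least once with
    support <= max_support; this is how the sketch reads 'maximum support'.
    """
    n = len(transactions)
    tid_lists = build_tid_lists(transactions)  # the only pass over the data
    frequent_items = sorted(
        item for item, tids in tid_lists.items() if len(tids) / n >= min_support
    )
    frequent, rare = {}, {}
    # Enumerate the non-empty subsets (power set) of the frequent 1-itemsets.
    for size in range(1, len(frequent_items) + 1):
        for itemset in combinations(frequent_items, size):
            count = len(tids_of(itemset, tid_lists))
            if count / n >= min_support:
                frequent[itemset] = count / n
            elif 0 < count / n <= max_support:
                rare[itemset] = count / n
    return frequent, rare


def rule_metrics(antecedent, consequent, tid_lists, n_transactions):
    """Confidence and lift of antecedent -> consequent (lift is the assumed 'interest')."""
    a = tids_of(antecedent, tid_lists)
    b = tids_of(consequent, tid_lists)
    both = a & b
    confidence = len(both) / len(a)
    lift = n_transactions * len(both) / (len(a) * len(b))
    return confidence, lift


# Hypothetical sample data.
transactions = [
    {"bread", "milk"},
    {"bread", "milk"},
    {"bread", "milk", "butter"},
    {"bread", "butter"},
    {"milk"},
]
frequent, rare = mine_itemsets(transactions, min_support=0.4, max_support=0.2)
print("frequent:", frequent)
print("rare:", rare)
print(rule_metrics(("bread",), ("milk",), build_tid_lists(transactions), len(transactions)))
```

Enumerating the full power set is exponential in the number of frequent items, so a sketch like this only suits small item sets. A lift close to 1 indicates that the antecedent and consequent are nearly independent, which is the sense in which an interest measure can filter out trivial rules.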
Keywords/Search Tags: association rules mining, frequent itemset, interest, outlier, outlier mining