
Multi-scale Association Rules Mining Method

Posted on: 2016-12-09
Degree: Master
Type: Thesis
Country: China
Candidate: M M Liu
Full Text: PDF
GTID: 2308330461977436
Subject: Computer Science and Technology
Abstract/Summary:
Association rules mining is an important part of data mining; its ultimate goal is to find the potential frequent patterns or correlations hidden in data. Multi-scale science is an emerging research field whose essence is to analyze the multi-level, multi-scale structural characteristics of research objects, explore the root causes of multi-scale expressions, and examine the deep relationships among expressions at different scales. Multi-scale theory has already been introduced into spatial data mining, and much preliminary research has been done on the multi-scale features of spatial data. This paper introduced multi-scale theory into general data mining and, furthermore, extended it to a wider range of data types. Taking association rules mining as the entry point, this paper studied universal multi-scale data mining from both theoretical and methodological perspectives. Building on the essence of multi-scale science, we developed a multi-scale data theory organized around a set of related concepts. On the basis of this theory, we put forward a process framework for multi-scale data mining and proposed a scaling-up mining algorithm and a scaling-down mining algorithm for multi-scale association rules mining. The proposed algorithms realized multi-scale mining of association rules and provided theoretical and methodological support for users' multi-scale decisions.

This paper centred on multi-scale association rules mining. Its main contents are as follows:

1. Research on multi-scale data mining theory. To address the lack of a universal, integrated theoretical foundation in the multi-scale data mining field, we studied multi-scale data mining theory in three major aspects: multi-scale data, multi-scale data mining, and the multi-scale data mining process framework.
First, we defined data-scale-partition, data-scale and unit-scale dataset on the basis of the concept hierarchy. Following those definitions, we identified four kinds of relationships between multi-scale datasets: ancestor and descendant datasets, father and son datasets, sibling datasets, and upper-layer and lower-layer datasets. Together these formed a preliminary concept system of multi-scale data. Second, we defined multi-scale data mining, identified scale conversion of knowledge as its essence, and, following the general classification of scale conversion, divided multi-scale data mining algorithms into two classes: scaling-up mining algorithms and scaling-down mining algorithms. This established the essence and direction of multi-scale data mining. Lastly, we established a staged multi-scale data mining process framework, which is used to guide and standardize the process of multi-scale data mining.

2. The scaling-up association rules mining algorithm. To address the lack of an explicit multi-scale data mining algorithm, and focusing on association rules mining and scale conversion of knowledge, we proposed SU-ARMA (Scaling-Up Association Rules Mining Algorithm). It is based on sampling theory and the Jaccard similarity coefficient, and it realizes scaling-up conversion of knowledge among multi-scale datasets.

3. The scaling-down association rules mining algorithm. Again focusing on association rules mining and scale conversion of knowledge, we proposed SD-ARMA (Scaling-Down Association Rules Mining Algorithm). It is based on inverse distance weighting, an interpolation method, and it realizes scaling-down conversion of knowledge among multi-scale datasets.
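The abstract names two standard building blocks, the Jaccard similarity coefficient and inverse distance weighting, but does not say how SU-ARMA and SD-ARMA combine them. The following is only a minimal sketch of the two formulas themselves, not the thesis's algorithms. The function names `jaccard` and `idw_estimate`, the `power` parameter, and the toy itemsets and support values are all assumptions made for illustration.

```python
from math import dist


def jaccard(a, b):
    """Jaccard similarity coefficient |A & B| / |A | B| of two collections."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # convention chosen here: two empty sets count as identical
    return len(a & b) / len(a | b)


def idw_estimate(target, known_points, power=2.0):
    """Inverse-distance-weighted estimate of a value at `target`.

    known_points is a list of (coordinates, value) pairs. Each known value
    is weighted by 1 / distance**power, so nearer points count for more.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for coords, value in known_points:
        d = dist(target, coords)
        if d == 0.0:
            return value  # target coincides with a known point
        weight = 1.0 / d ** power
        weighted_sum += weight * value
        weight_total += weight
    if weight_total == 0.0:
        raise ValueError("known_points must not be empty")
    return weighted_sum / weight_total


# Hypothetical frequent itemsets mined from two sibling datasets.
itemsets_a = {frozenset({"milk"}), frozenset({"bread"}), frozenset({"milk", "bread"})}
itemsets_b = {frozenset({"milk"}), frozenset({"bread"}), frozenset({"beer"})}
print(jaccard(itemsets_a, itemsets_b))  # 2 shared out of 4 distinct -> 0.5

# Hypothetical supports of one itemset at two known locations,
# interpolated at a third location.
known = [((1.0, 0.0), 0.2), ((2.0, 0.0), 0.6)]
print(idw_estimate((0.0, 0.0), known))
```

Here the Jaccard coefficient compares whole collections of frequent itemsets, and the interpolation treats an itemset's support as a value attached to a position. Both readings are guesses at a plausible use; the abstract does not define what is compared or what the distance is measured over.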
The confidence intervals for the error rates of SU-ARMA and SD-ARMA were derived and proved using statistical principles and machine learning theory. Furthermore, we analyzed the advantages of SU-ARMA and SD-ARMA over traditional association rules mining methods and described the domains to which they apply.

4. Verification experiments on the multi-scale data theory and the multi-scale association rules mining algorithms. SU-ARMA and SD-ARMA were applied to the IBM T10I4D100K synthetic dataset and to a demographic dataset from H province with obvious multi-scale features. The experimental results show that SU-ARMA and SD-ARMA achieve better coverage rate and accuracy and lower average support error, and that they are more efficient than the traditional approach of applying Apriori directly. SU-ARMA and SD-ARMA are therefore feasible and efficient.
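The abstract states that a confidence interval for the error rate was derived but does not give its form. As a point of reference only, the sketch below computes the common normal-approximation interval for an error rate observed on n samples, p ± z·sqrt(p(1 − p)/n). Whether the thesis uses this exact form is an assumption, and the name `error_rate_interval` and the sample counts are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def error_rate_interval(errors, n, confidence=0.95):
    """Normal-approximation confidence interval for an observed error rate.

    errors: number of wrong results among n evaluated ones.
    Returns (low, high), clipped to the range [0, 1].
    """
    if n <= 0 or not 0 <= errors <= n:
        raise ValueError("need n > 0 and 0 <= errors <= n")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must lie strictly between 0 and 1")
    p = errors / n
    # Two-sided critical value of the standard normal distribution.
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    half_width = z * sqrt(p * (1.0 - p) / n)
    return max(0.0, p - half_width), min(1.0, p + half_width)


# Hypothetical outcome: 12 of 100 derived itemsets were wrong.
low, high = error_rate_interval(12, 100)
print(f"{low:.3f} .. {high:.3f}")
```

The approximation is usually considered reasonable only when n is fairly large and the error rate is not too close to 0 or 1; for small samples an exact binomial interval would be the safer choice.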
Keywords/Search Tags: Multi-scale, frequent itemset, association rules, scale convert, multi-scale association rules mining