
Research On Association Rules And Application In Spatial Data Mining

Posted on: 2008-07-12
Degree: Master
Type: Thesis
Country: China
Candidate: Y B Liu
Full Text: PDF
GTID: 2178360215970776
Subject: Computer software and theory
Abstract/Summary:
Today, we are immersed in overwhelming amounts of data. Data mining is the process of finding previously unknown, useful information hidden in large volumes of incomplete, noisy, fuzzy and random data. It is among the biggest concerns in scientific research as well as in the development and application of databases. Association rule mining is one of the major research issues in data mining and is of tremendous significance in business, science and other domains. With the development of spatial techniques, the demand for discovering interesting and potentially useful knowledge from spatial databases is on the rise. Spatial data mining, the application of data mining to spatial databases, came into existence as a result. This paper analyzes spatial databases for association rules. The main work of the paper is as follows:

(1) First, some basic theories and techniques of data mining, association rules and spatial data mining are reviewed. Algorithms for association rule mining are discussed, ranging from the Apriori algorithm to FP-growth algorithms. Some problems that exist in these algorithms are also discussed.

(2) Integrating some existing techniques, we present an algorithm, CFPmine, which is inspired by several previous works. First, the algorithm uses constrained subtrees of a compact FP-tree (CFP-tree) to mine frequent patterns, so that conditional FP-trees need not be built during mining and memory consumption is reduced. Second, an array-based technique is used to reduce the time spent traversing the CFP-tree. The experimental evaluation shows that CFPmine performs favorably: it outperforms Apriori, Eclat and FP-growth, and requires less memory than FP-growth.

(3) Then, a detailed analysis of the MBSA algorithm (based on bitset mapping), including its pros and cons, is put forward. Because MBSA is not suitable for large-scale spatial data mining, this paper presents the TP-PB algorithm (Two Phase Association Rule Algorithm based on Partitioning and BitSet with the Apriori property) in its place. Both theoretical analysis and experiments show that TP-PB is accurate and effective. Because it uses the Partition technique, TP-PB proves to be more efficient in dealing with large-scale spatial data. A better result is achieved when the TP-PB algorithm is applied to precision agriculture.
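The abstract does not give the details of TP-PB, so the following is only a minimal Python sketch of the general ideas it names: bitset-based support counting, Apriori pruning, and a two-phase partition scheme. All function names (`to_bitsets`, `support`, `frequent_itemsets`, `two_phase`) and the use of Python integers as bitsets are assumptions made for illustration, not the thesis's implementation.

```python
import math
from itertools import combinations


def to_bitsets(transactions):
    """Map each item to an int whose bit i is set when transaction i contains it."""
    bitsets = {}
    for i, items in enumerate(transactions):
        for item in items:
            bitsets[item] = bitsets.get(item, 0) | (1 << i)
    return bitsets


def support(itemset, bitsets, n):
    """Count transactions containing every item: AND the bitsets, count set bits."""
    mask = (1 << n) - 1
    for item in itemset:
        mask &= bitsets.get(item, 0)
    return bin(mask).count("1")


def frequent_itemsets(transactions, min_count):
    """Level-wise (Apriori-style) search with bitset support counting."""
    n = len(transactions)
    bitsets = to_bitsets(transactions)
    level = [frozenset([i]) for i in bitsets
             if support([i], bitsets, n) >= min_count]
    found = set(level)
    while level:
        # Join step: merge pairs of frequent k-itemsets into (k+1)-itemsets.
        candidates = {a | b for a, b in combinations(level, 2)
                      if len(a | b) == len(a) + 1}
        # Prune step (Apriori property): every k-subset must already be frequent.
        level = [c for c in candidates
                 if all(c - {x} in found for x in c)
                 and support(c, bitsets, n) >= min_count]
        found.update(level)
    return found


def two_phase(transactions, min_ratio, n_parts):
    """Partition-style mining: local candidates first, then one global count."""
    n = len(transactions)
    size = max(1, math.ceil(n / n_parts))
    parts = [transactions[i:i + size] for i in range(0, n, size)]
    # Phase 1: an itemset that is frequent globally must be frequent in at
    # least one partition, so the union of local results is a safe candidate set.
    candidates = set()
    for part in parts:
        candidates |= frequent_itemsets(part, math.ceil(min_ratio * len(part)))
    # Phase 2: count each candidate once against the full data set.
    bitsets = to_bitsets(transactions)
    need = math.ceil(min_ratio * n)
    result = {}
    for c in candidates:
        s = support(c, bitsets, n)
        if s >= need:
            result[c] = s
    return result
```

Calling `two_phase(transactions, 0.5, 2)` on a list of item sets returns a dictionary from each frequent itemset (as a `frozenset`) to its global support count. In this sketch each partition is small enough to be mined on its own, which is the property that makes partitioning attractive for large data sets.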
Keywords/Search Tags: Data Mining, Association Rules, Spatial Data Mining, Apriori algorithm, CFPmine algorithm, MBSA algorithm, TP-PB algorithm