
Research On Mining Of Co-location Patterns

Posted on: 2013-06-20
Degree: Doctor
Type: Dissertation
Country: China
Candidate: F Qian
GTID: 1228330395989260
Subject: Computer Science and Technology
Abstract/Summary:
In recent years, the rapid development of mobile devices, camera applications, RFID, wireless sensors and spatial databases in personal information services, public services and scientific research has led to an ever larger store of spatial data, from which useful and previously unknown information needs to be discovered. Spatial data mining extracts implicit, non-trivial information from large spatial data sets and can reveal the potential requirements of customers, so that they can be offered more user-friendly and intelligent services. In this thesis, we focus on an important task of spatial data mining, namely the mining of co-location patterns, which discovers subsets of features whose events are frequently located together in geographic space. It has been applied in many areas such as mobile commerce, earth science, biology, public health and transportation, where the discovered dependency relationships among features are widely used.

Because the problem is both general and diverse, co-location pattern mining takes different forms in different types of data sets and applications. Building on current research on co-location pattern mining, we analyze the limitations of the conventional approach and explore more effective and efficient mining theories and techniques in four directions: spatial co-location pattern mining, regional co-location pattern mining, spatio-temporal co-location pattern mining and co-location trajectory pattern mining. The contributions are summarized as follows.

(1) The first part of the thesis discusses the limitations of the conventional approach, which relies on a distance threshold and a prevalence threshold, and proposes to construct the neighborhood relationship graph dynamically. Based on an analysis of the co-locating neighborhood relationship, we propose a novel prevalence reward that drives the iterative mining process. Experimental results on real-world data sets indicate that the proposed iterative mining framework is effective at discovering prevalent co-locations.

(2) The second part of the thesis further analyzes the limitations of using a distance threshold in spatial data sets with varying densities, especially for the discovery of regional co-location patterns. We propose a hierarchical mining framework that finds regional co-locations using a k-nearest neighbor graph instead of a distance threshold, and thereby takes the varying density into account. A novel distance variation coefficient is also proposed to drive the mining algorithm; it controls the distance variation within regions and avoids having to predefine k for each region. Experimental results on both synthetic and real-world data sets indicate that the framework effectively finds regional co-locations whose prevalence the conventional approach might over-estimate or under-estimate.

(3) The third part of the thesis proposes a weighted sliding window model to discover spatio-temporal co-location patterns. Previous research on this topic either treats time as an additional dimension or simply runs the mining process on each time segment separately. The former approach leads to high computational complexity, while the latter may overlook co-locations that span segments. Compared with these approaches, our weighted sliding window model is more general: it measures the prevalence of co-locations with a time interval penalty, takes the direction of time into account, and reduces the computational cost. Experimental evaluation on both synthetic and real-world data sets shows that the algorithm is effective and scalable for finding spatio-temporal co-locations.

(4) The last part of the thesis defines the problem of mining co-location trajectory patterns and describes its domain applications. We propose a novel co-location trajectory tree to index the spread elements of patterns, which has the monotonic property with respect to co-location size. A computationally efficient algorithm is proposed and proved to be correct and complete, and its computational complexity is analyzed. Experimental evaluation on both synthetic and real-world data sets shows that the algorithm is effective and efficient for mining co-location trajectories.
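For orientation, the conventional approach that the thesis takes as its starting point (a fixed distance threshold plus a prevalence threshold) can be sketched as below. The abstract does not name a prevalence measure; this sketch assumes the participation index commonly used in the co-location literature, and the function name `participation_index`, the `(feature, x, y)` event layout and the brute-force enumeration are illustrative assumptions, not the thesis's algorithm.

```python
from itertools import product
from math import dist


def participation_index(events, pattern, d):
    """Prevalence of `pattern` under a fixed distance threshold `d`.

    events  : list of (feature, x, y) tuples (hypothetical layout)
    pattern : tuple of distinct feature names, e.g. ("A", "B")
    d       : distance threshold defining the neighbor relation

    Assumed measure: the participation ratio of a feature is the share of
    its events that occur in at least one row instance (a set of events,
    one per feature, that are pairwise within distance d). The index is
    the minimum ratio over the features of the pattern.
    """
    # Group event indices by feature, keeping only features in the pattern.
    by_feature = {f: [i for i, e in enumerate(events) if e[0] == f]
                  for f in pattern}
    if any(not idxs for idxs in by_feature.values()):
        return 0.0  # a feature with no events cannot co-locate

    participating = {f: set() for f in pattern}
    # Brute force: try every combination with one event per feature.
    for combo in product(*(by_feature[f] for f in pattern)):
        pairwise_close = all(
            dist(events[a][1:], events[b][1:]) <= d
            for i, a in enumerate(combo)
            for b in combo[i + 1:]
        )
        if pairwise_close:
            for f, idx in zip(pattern, combo):
                participating[f].add(idx)

    return min(len(participating[f]) / len(by_feature[f]) for f in pattern)
```

A pattern would then be reported as prevalent when its index reaches the chosen prevalence threshold. Both thresholds are global in this sketch, which is the property that parts (1) and (2) of the thesis identify as a limitation.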
Keywords/Search Tags: Spatial data mining, Co-location pattern, Co-location trajectory pattern, Regional pattern mining, Iterative framework, Hierarchical framework, k-nearest neighbor graph