Research On Mining Spatial Co-location Patterns Based On Region Partition

Posted on: 2019-10-19
Degree: Doctor
Type: Dissertation
Country: China
Candidate: J S Zhao
GTID: 1360330548973367
Subject: Information and Communication Engineering
Abstract/Summary:
With the rapid development and wide application of spatial information technologies such as remote sensing, geographic information systems and global positioning systems, large volumes of spatial data containing location information are being generated. Spatial data mining is the process of finding interesting, previously unknown and potentially useful knowledge and patterns in large spatial datasets. This thesis centers on spatial co-location pattern mining, which discovers sets of spatial features whose instances frequently appear in each other's spatial neighborhood. As an important task of spatial data mining, co-location pattern mining is applied extensively in ecology, environmental protection, public security, public health, urban planning, transportation, location-based services and other fields. Motivated by the correlation and heterogeneity of spatial data, the diversity of data and practical application needs, this thesis introduces the spatial distribution characteristics of patterns into co-location pattern mining from two aspects, both based on spatial region partition. It explores even co-location pattern mining, prevalent and even co-location pattern mining, high utility co-location pattern mining based on the importance of spatial regions, and parallel mining. The main research contents and contributions are summarized as follows:

1. The traditional approach of mining co-location patterns based on the participation index ignores the distribution characteristics of spatial patterns. The pattern entropy approach has the disadvantages of using a single region partition and of a threshold that is difficult to set. This thesis defines the evenness coefficient of a pattern to describe how the pattern's instances are distributed in space, discusses grid partition and cluster partition of the spatial region, and gives specific methods for implementing them. On this basis, the problem of mining even co-location patterns is formulated and mining algorithms are proposed. Experiments on synthetic and real-world datasets evaluate the proposed algorithm and compare it with the traditional approach and the pattern entropy approach.

2. Two strategies are proposed for mining co-location patterns that consider both the prevalence of the patterns and the distribution evenness of their instances. Prevalent and even co-location patterns and weighted even co-location patterns are also defined. Strategy 1 post-mines even patterns from the prevalent patterns. Strategy 2 integrates the participation index and the evenness coefficient into the weighted evenness, a new interest measure for weighted even co-location pattern mining. The two strategies use the anti-monotonicity of the participation index directly and indirectly, and pruning techniques are introduced in the process of generating candidate patterns. Experimental results on synthetic and real datasets show that Strategy 1 effectively reduces the number of prevalent co-location patterns, and that the weighted even co-location patterns discovered by Strategy 2 are more meaningful because the measure effectively combines the prevalence and evenness of the patterns.

3. Taking regional importance into account, the regional utility is transformed into instance utility, and an interest measure for high utility co-location pattern mining is defined. This thesis proposes the problem of high utility co-location pattern mining based on regional importance, together with a basic algorithm. To reduce the computational cost, the spatial regions are sorted by utility from high to low, and an improved algorithm with a pruning strategy is presented. Experiments on synthetic and real datasets evaluate the high utility co-location patterns based on regional importance and compare the results with traditional co-location pattern mining. The efficiency and scalability of the basic algorithm and the improved algorithm are also compared.

4. Because of limited memory capacity and computing power, a single computer cannot effectively perform high utility co-location pattern mining on large datasets. Based on MapReduce, a parallel processing framework and an algorithm for high utility co-location pattern mining are proposed. Experiments on synthetic and real datasets show that, on small datasets, the parallel algorithm is more efficient than the serial algorithm when the distance threshold is larger and the utility participation threshold is smaller. In addition, the parallel approach can handle large datasets and is highly scalable.
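
For readers unfamiliar with the participation index that the abstract names as the traditional prevalence measure, the following is a minimal sketch of its commonly used definition: for each feature in a pattern, take the fraction of that feature's instances that appear in at least one row instance of the pattern, then take the minimum over the features. The function name, data layout and toy data are assumptions made for illustration only; they are not taken from the thesis, and the thesis's evenness coefficient, weighted evenness and utility measures are not reproduced here.

```python
def participation_index(pattern, row_instances, feature_counts):
    """Participation index of a co-location pattern (illustrative sketch).

    pattern        : tuple of feature names, e.g. ("A", "B")
    row_instances  : list of dicts {feature name: instance id}, one dict per
                     group of mutually neighboring instances of the pattern
    feature_counts : dict {feature name: total number of instances}
    """
    # Distinct instances of each feature that take part in the pattern.
    participating = {feature: set() for feature in pattern}
    for row in row_instances:
        for feature in pattern:
            participating[feature].add(row[feature])

    # Participation ratio per feature; the index is the smallest ratio.
    ratios = [len(participating[f]) / feature_counts[f] for f in pattern]
    return min(ratios)


# Hypothetical toy data: 4 instances of A, 2 of B, 2 of C.
counts = {"A": 4, "B": 2, "C": 2}
rows_ab = [
    {"A": "A1", "B": "B1"},
    {"A": "A2", "B": "B1"},
    {"A": "A3", "B": "B2"},
]
rows_abc = [{"A": "A1", "B": "B1", "C": "C1"}]

pi_ab = participation_index(("A", "B"), rows_ab, counts)
pi_abc = participation_index(("A", "B", "C"), rows_abc, counts)
```

In this toy data the index of {A, B, C} cannot exceed that of {A, B}, which illustrates the anti-monotone property that the pruning in contribution 2 relies on. Because the index depends only on counts, two patterns with the same value may be distributed very differently in space, which is the gap the evenness coefficient is meant to address.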
Keywords/Search Tags: Spatial data mining, Region partition, Spatial co-location pattern, Evenness coefficient, High utility pattern, MapReduce framework, Spatial distribution characteristics