Research On Spatial Co-location Pattern Mining Based On Spatial Compression Cliques

Posted on: 2022-07-30
Degree: Doctor
Type: Dissertation
Country: China
Candidate: W H Tran Van Ha
GTID: 1528306335495074
Subject: Information and Communication Engineering
Abstract/Summary:
Mining prevalent spatial co-location patterns is an important task in spatial data mining. These patterns expose the distribution rules of spatial features and have high application value in many fields. Extracting spatial co-location patterns from spatial data sets is more complicated and difficult than finding frequent item sets in traditional transactional data sets, because the relationships among spatial instances are implicit and continuous and exhibit autocorrelation. This thesis studies the problem from the essence of spatial co-location pattern mining, which is enumerating and collecting spatial clique instances. The main results are as follows:

(1) The existing generate-test candidate mining framework has a shortcoming: whenever the user adjusts the prevalence threshold, the mining process has to regenerate candidates and re-collect the spatial cliques of these candidates. To address this, an overlapping clique-based spatial co-location pattern mining algorithm is proposed. The algorithm collects the spatial clique instances of all patterns only once; when users change the prevalence threshold, prevalent co-location patterns can be quickly filtered without re-collecting their spatial clique instances.

(2) Enumerating and collecting spatial clique instances presupposes that users set a distance threshold to materialize the neighbor relationships of spatial instances. Although users can choose a distance threshold according to their mining task, the first law of geography indicates that the proximity relationship between spatial instances should be determined by their distribution in space. Thus, an algorithm is proposed that materializes the neighbor relationships of spatial instances without a distance threshold. This algorithm not only determines the neighbor relationships automatically according to the distribution of the instances in space, but also effectively solves the problem that a single distance threshold cannot accurately capture the neighbor relationships of instances in heterogeneous spatial data sets.

(3) To reduce both the large amount of memory required to store spatial clique instances and the redundant patterns in the mining result, a maximal spatial co-location pattern mining algorithm based on maximal spatial cliques and a hash table is proposed. All spatial cliques of patterns are compressed into a set of maximal cliques, and these maximal cliques are further compressed into a hash table structure, so the memory consumption is very small. At the same time, because the result consists of maximal spatial co-location patterns, it is represented in the most concise and compact manner.

(4) Taking into account a non-spatial attribute of spatial instances, the utility value, an efficient algorithm for mining spatial high utility co-location patterns is proposed. Unlike previous algorithms, this algorithm does not separately enumerate and save the spatial cliques of each pattern; it holds only a few spatial cliques that compress all neighbor relationships of all instances, and these spatial cliques are further compressed into a hash table structure. Thus, the memory consumption is further decreased. The participating instances of patterns are obtained quickly by querying the hash table, so spatial high utility co-location patterns can be filtered efficiently.
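As a rough illustration of the ideas in (1), (3) and (4), the following Python sketch stores a set of cliques in a hash table keyed by feature set, queries the table for the participating instances of a pattern, and filters patterns by a prevalence threshold. It is not the algorithm of the thesis. The function names, the toy data, and the choice of the standard participation index (the minimum, over the features of a pattern, of the fraction of that feature's instances that take part in the pattern) as the prevalence measure are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations


def build_clique_table(cliques):
    """Hash cliques by their feature set (hypothetical layout, not the thesis's).

    Each clique is a list of (feature, instance_id) pairs.
    """
    table = defaultdict(list)
    for clique in cliques:
        key = frozenset(feature for feature, _ in clique)
        table[key].append(clique)
    return table


def participating_instances(table, pattern):
    """Collect, per feature of `pattern`, the instances found in any stored
    clique whose feature set contains the whole pattern."""
    pattern = frozenset(pattern)
    found = {feature: set() for feature in pattern}
    for key, stored in table.items():
        if pattern <= key:
            for clique in stored:
                for feature, instance_id in clique:
                    if feature in pattern:
                        found[feature].add(instance_id)
    return found


def participation_index(table, pattern, feature_counts):
    """Assumed prevalence measure: minimum participation ratio over features."""
    found = participating_instances(table, pattern)
    return min(len(found[f]) / feature_counts[f] for f in pattern)


def prevalent_patterns(table, feature_counts, min_prev):
    """Filter patterns by threshold using only the table; cliques are not re-collected."""
    candidates = set()
    for key in table:
        for size in range(2, len(key) + 1):
            candidates.update(combinations(sorted(key), size))
    result = {}
    for pattern in candidates:
        pi = participation_index(table, pattern, feature_counts)
        if pi >= min_prev:
            result[pattern] = pi
    return result


# Toy data (invented): four maximal cliques over features A, B, C.
cliques = [
    [("A", 1), ("B", 1), ("C", 1)],
    [("A", 2), ("B", 1)],
    [("A", 3), ("C", 2)],
    [("B", 2), ("C", 2)],
]
counts = {"A": 3, "B": 2, "C": 2}  # total number of instances per feature

table = build_clique_table(cliques)         # built once
loose = prevalent_patterns(table, counts, 0.5)
strict = prevalent_patterns(table, counts, 0.6)  # new threshold, same table
```

In this sketch the table is built a single time and both thresholds are answered from it, which mirrors the point that changing the prevalence threshold need not trigger another round of clique collection. The candidate enumeration here is deliberately naive and grows exponentially with clique size.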
Keywords/Search Tags:Spatial data mining, Spatial co-location pattern, Spatial clique, Maximal pattern, High utility pattern