| Traditional spatial co-location pattern mining often produces a large number of redundant patterns due to technological limitations, making it difficult for users to store and analyze the results. This limitation is manifested in three ways. First, the value of a spatial co-location pattern is measured solely by its participation degree, so an unreasonable threshold can significantly bias the results. Second, because co-location patterns are sets of spatial features, the number of mined patterns grows exponentially with the number of spatial features, making storage difficult. Third, deriving high-order patterns from low-order patterns produces a large number of redundant co-location patterns. To address these problems, this paper studies co-location pattern compression methods from the perspective of compression ability and time efficiency. The specific work is as follows: (1) To address the redundancy of co-location patterns, pattern similarity measures based on both features and instances are proposed. Traditional co-location pattern compression methods often use a single criterion to decide whether a pattern is redundant, which makes it difficult to balance compression ability against information preservation. This paper therefore proposes instance similarity and feature similarity, which objectively evaluate the similarity between two co-location patterns according to the two kinds of information a co-location pattern expresses. (2) Completely neighbouring clique mining is introduced to improve the pattern mining method. Existing spatial co-location pattern mining methods typically find pattern instances by instance linking or by constructing neighbour tables. These two approaches are not only inefficient but also generate duplicate instances. Completely neighbouring clique mining directly mines all completely neighbouring cliques in space and classifies them according to the features of the instances that make up each clique, building a hash table that is easy to query when calculating pattern participation. Since a pattern instance is itself a special type of clique, it must belong to some completely neighbouring clique, so this method avoids generating duplicate instances. (3) Co-location pattern compression algorithms are designed. Based on completely neighbouring clique mining, this paper first proposes an efficient SPI-closed co-location pattern mining algorithm named NRCPM. Storing the completely neighbouring cliques in a special data structure called a clique hash table significantly speeds up the computation of participation and superparticipation, so that SPI-closed co-location patterns are mined efficiently and high-efficiency compression is achieved. Subsequently, based on the proposed feature and distribution similarity measures, a novel representative co-location pattern extraction algorithm named RCPE is proposed, which applies different compression procedures to maximal and non-maximal co-location patterns according to their characteristics. For maximal co-location patterns, representative patterns are extracted through feature similarity clustering followed by distribution similarity screening; for non-maximal co-location patterns, valuable patterns are extracted through distribution similarity filtering. |
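
The clique hash table described in (2) and (3) can be illustrated with a minimal sketch. It is not the NRCPM implementation: the function names, the `(feature, instance_id)` representation of an instance, and the toy data are all assumptions made for illustration, and the cliques are assumed to have been mined already. Cliques are grouped under the set of features they contain, and the participation index of a pattern is then read off from every clique whose feature set covers the pattern, using the standard definition (the minimum, over the pattern's features, of the fraction of that feature's instances that take part in the pattern).

```python
from collections import defaultdict

# Assumption: an instance is a (feature, instance_id) pair, e.g. ("A", 1).


def build_clique_hash_table(cliques):
    """Group cliques by the set of features carried by their instances."""
    table = defaultdict(list)
    for clique in cliques:
        key = frozenset(feature for feature, _ in clique)
        table[key].append(frozenset(clique))
    return table


def participation_index(pattern, table, feature_counts):
    """Return the minimum participation ratio over the pattern's features.

    An instance of a feature participates in the pattern if it lies in
    some clique whose feature set is a superset of the pattern.
    """
    pattern = frozenset(pattern)
    participating = {feature: set() for feature in pattern}
    for key, cliques in table.items():
        if pattern <= key:  # this group of cliques covers the pattern
            for clique in cliques:
                for feature, inst_id in clique:
                    if feature in pattern:
                        participating[feature].add(inst_id)
    return min(
        len(participating[f]) / feature_counts[f] for f in pattern
    )


# Hypothetical toy data: three cliques over features A, B and C.
cliques = [
    {("A", 1), ("B", 1), ("C", 1)},
    {("A", 2), ("B", 1)},
    {("A", 3), ("C", 2)},
]
# Total number of instances per feature in the (hypothetical) data set.
feature_counts = {"A": 4, "B": 2, "C": 2}

table = build_clique_hash_table(cliques)
pi_ab = participation_index({"A", "B"}, table, feature_counts)
pi_abc = participation_index({"A", "B", "C"}, table, feature_counts)
```

Because each participating instance is collected into a set, an instance that appears in several cliques is counted once, which mirrors the abstract's point that the clique-based approach avoids duplicate instances.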