Density-Based Top-K Spatial Textual Clusters Retrieval On Road Network

Posted on: 2024-03-28
Degree: Master
Type: Thesis
Country: China
Candidate: Y Z Wang
Full Text: PDF
GTID: 2542307178473734
Subject: Computer Science and Technology
Abstract/Summary:
5G technology facilitates the widespread use of location-based services (LBS). In spatial data analysis, combining LBS with big data and machine learning has become an increasingly important research area. The top-k spatial textual cluster retrieval (k-STC) query, which uses both spatial and textual information to return k high-density clusters, is an emerging variant of the spatial keyword query. So far, k-STC has only been applied in Euclidean space. Because connectivity in a road network is limited, existing methods cannot effectively solve the network-based k-STC problem and may even introduce data bias or errors. Moreover, existing methods often represent a cluster by the text similarity of the single object in it that is most relevant to the query keywords, so only part of the query keywords may be taken into account. Extending k-STC to road networks broadens its application scenarios, especially in decision-making and travel planning, and is expected to provide more accurate and practical data support for urban transportation, urban planning, environmental protection, and other areas. To extend the research and application of k-STC to road networks and provide more refined query services for users, this thesis studies a new form of k-STC on road networks (k-STC-network) and develops a more comprehensive measure that takes into account the location of spatio-textual objects, their keyword information, and the density of objects distributed across the road network.

The work of this thesis consists of the following two parts.

First, the existing index for k-STC cannot be directly migrated to the road network. To support spatial keyword queries on large-scale road networks, this thesis proposes an efficient IG*-tree index structure based on the G*-tree. Each node of the IG*-tree includes a text-based inverted file and a distance matrix, which store the textual and position information of the objects. Each leaf node also contains a distance-based inverted file to speed up queries. A basic query algorithm, BM, is proposed for handling k-STC-network queries. Experimental results on different datasets show that the proposed index structure is effective and has practical application value.

Second, two algorithms, Net DBSCAN and EXPDBSCAN, are proposed to optimize k-STC-network queries from different perspectives. Net DBSCAN employs a variety of pruning strategies and upper-bound scores to reduce the number of unqualified and irrelevant objects and to increase query efficiency. EXPDBSCAN adapts to the density parameter using the concept of area density, which improves its usability. On real road network datasets, the proposed index and related algorithms are shown to be efficient and scalable.
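
As a rough illustration of why Euclidean clustering does not carry over directly, the following minimal sketch runs a DBSCAN-style expansion in which the neighborhood of an object is defined by shortest-path distance on the road network rather than straight-line distance. It is not the thesis's BM, Net DBSCAN, or EXPDBSCAN algorithm and does not use the IG*-tree. The function names, the assumption that every object sits exactly on a network vertex, and the rule that an object is relevant if it shares at least one keyword with the query are all hypothetical simplifications made for this example.

```python
import heapq
from collections import deque


def network_neighbors(graph, source, eps):
    """Bounded Dijkstra: vertices within network distance eps of source.

    graph maps a vertex to a list of (neighbor, edge_length) pairs.
    """
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in graph.get(u, []):
            nd = d + w
            if nd <= eps and nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def network_dbscan(graph, objects, keywords, eps, min_pts):
    """DBSCAN over keyword-relevant objects using network distance.

    objects maps object_id -> (vertex, set_of_keywords).
    Assumption: an object is relevant if it shares any query keyword.
    """
    relevant = {o: v for o, (v, kws) in objects.items() if kws & keywords}
    by_vertex = {}
    for o, v in relevant.items():
        by_vertex.setdefault(v, []).append(o)

    def region(o):
        # All relevant objects on vertices reachable within eps.
        reach = network_neighbors(graph, relevant[o], eps)
        return [p for v in reach for p in by_vertex.get(v, [])]

    labels = {}    # object_id -> cluster index, or -1 for noise
    clusters = []
    for o in sorted(relevant):
        if o in labels:
            continue
        nbrs = region(o)
        if len(nbrs) < min_pts:
            labels[o] = -1  # provisional noise; may become a border object
            continue
        cid = len(clusters)
        clusters.append([o])
        labels[o] = cid
        queue = deque(nbrs)
        while queue:
            p = queue.popleft()
            if labels.get(p) == -1:
                labels[p] = cid  # former noise joins as a border object
                clusters[cid].append(p)
                continue
            if p in labels:
                continue
            labels[p] = cid
            clusters[cid].append(p)
            p_nbrs = region(p)
            if len(p_nbrs) >= min_pts:  # p is a core object, keep expanding
                queue.extend(p_nbrs)
    return clusters
```

In this sketch two objects that are close in straight-line terms but far apart along the roads never fall into the same neighborhood, which is the behavior a network-based k-STC query needs. Ranking the resulting clusters and returning the top k would require a scoring function, which is left out because the abstract does not specify one.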
Keywords/Search Tags: Spatial Keyword, DBSCAN, Road Network, Hybrid Index