Spatio-Textual Query Based On Density Clustering

Posted on: 2020-01-28
Degree: Master
Type: Thesis
Country: China
Candidate: C Wang
GTID: 2428330602452234
Subject: Engineering
Abstract/Summary:
In recent years, location-based services have developed rapidly. Users can retrieve information based on location and so obtain more realistic retrieval results. As the basis of location-based information retrieval, spatio-textual query has attracted growing attention from scholars. Researchers in this field have proposed a variety of query methods to solve problems arising in real scenarios, but existing methods rarely take into account the influence of the surrounding points of interest on the returned results.

Building on the Top-k spatial textual cluster query, this thesis proposes a spatio-textual query based on density clustering. The query returns result clusters that meet a density requirement. That is, it recommends to users regions that satisfy the query conditions and are dense in spatio-textual objects. The proposed method addresses two problems of the Top-k spatial textual cluster query: high IO overhead and sensitivity to index structure parameters. The method first indexes the spatio-textual objects with an IR²-tree. It then searches the IR²-tree according to the query keywords and the maximum acceptable distance and returns the set of relevant spatio-textual objects. Finally, it applies a density-based clustering algorithm to the relevant objects and returns the final result clusters. This avoids traversing the entire data set and reduces both system IO overhead and query time.

This thesis combines the IR²-tree index structure with the traditional DBSCAN algorithm to design a clustered spatio-textual query algorithm based on DBSCAN. However, DBSCAN has high time complexity, which limits query efficiency. To address this, the thesis proposes two improved algorithms: (1) An improved rule-based clustering spatio-textual query algorithm. It uses a rule strategy to reduce the number of objects expanded in the ε-neighborhood of a core object in DBSCAN, thereby reducing clustering time. (2) An improved clustered spatio-textual query algorithm based on fast DBSCAN. It builds a grid structure from the query conditions supplied by the user and combines it with the fast DBSCAN algorithm to reduce time complexity.

An analysis of the IR²-tree index structure and of the query mode used in this thesis shows that, during an IR²-tree search, nodes containing obviously abnormal objects can be removed. To exploit this observation, the thesis proposes a clustering spatio-textual approximate query algorithm based on a pruning strategy. First, it improves the IR²-tree index structure by adding signature-file markers to the IR²-tree. Second, it designs a specific pruning strategy, applies it to the search of the improved IR²-tree, and implements an approximate query algorithm. When searching the improved IR²-tree, the approximate query algorithm removes nodes containing obviously abnormal objects early, near the top of the tree, which reduces system IO overhead and IR²-tree search time.

To examine how different query parameters affect the running time and IO overhead of each algorithm, the thesis compares the proposed algorithms on two datasets of different scales. The experimental results show that, under the same experimental conditions, the improved algorithm based on fast DBSCAN outperforms the other exact algorithms in this thesis. In addition, the proposed approximate query algorithm effectively identifies and prunes nodes containing abnormal objects, which further improves performance.
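The filter-then-cluster pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the thesis implementation. A linear scan stands in for the IR²-tree search, the clustering step is textbook DBSCAN without the rule-based or grid-based improvements, and the names `relevant_objects`, `dbscan`, `cluster_query`, the object fields `id`, `x`, `y`, `terms`, and the choice to require all query keywords are assumptions made for the example.

```python
from math import hypot


def relevant_objects(objects, query_loc, keywords, max_dist):
    """Linear-scan stand-in for the IR²-tree search (assumption).

    Keeps objects that contain every query keyword and lie within
    max_dist of the query location.
    """
    qx, qy = query_loc
    wanted = set(keywords)
    return [
        o for o in objects
        if wanted <= o["terms"] and hypot(o["x"] - qx, o["y"] - qy) <= max_dist
    ]


def dbscan(points, eps, min_pts):
    """Textbook DBSCAN over (x, y) tuples; returns one label per point, -1 = noise."""
    unvisited, noise = None, -1
    labels = [unvisited] * len(points)

    def neighbors(i):
        # eps-neighborhood of point i, including i itself
        xi, yi = points[i]
        return [j for j, (xj, yj) in enumerate(points)
                if hypot(xi - xj, yi - yj) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not unvisited:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = noise          # may later become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == noise:
                labels[j] = cluster    # border point reached from a core point
            if labels[j] is not unvisited:
                continue
            labels[j] = cluster
            nj = neighbors(j)
            if len(nj) >= min_pts:     # j is a core point: keep expanding
                queue.extend(nj)
    return labels


def cluster_query(objects, query_loc, keywords, max_dist, eps, min_pts):
    """Filter relevant objects, cluster them by density, return clusters of ids."""
    candidates = relevant_objects(objects, query_loc, keywords, max_dist)
    labels = dbscan([(o["x"], o["y"]) for o in candidates], eps, min_pts)
    clusters = {}
    for obj, label in zip(candidates, labels):
        if label != -1:
            clusters.setdefault(label, []).append(obj["id"])
    return list(clusters.values())
```

Only the candidates that survive the keyword and distance filter reach the clustering step, which is the property the abstract credits for avoiding a traversal of the entire data set. In the thesis the filter is performed by the IR²-tree rather than by a scan.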
Keywords/Search Tags: Spatio-textual data, Spatio-textual query, Density-based clustering, IR²-tree