Key Techniques Of Spatio-Textual Query Processing

Posted on: 2016-08-30
Degree: Doctor
Type: Dissertation
Country: China
Candidate: S T Liu
GTID: 1108330503956158
Subject: Computer Science and Technology
Abstract/Summary:
With the near ubiquity of GPS technology and the rapid development of mobile devices, the number of location-based services (LBS) keeps growing. These services generate large amounts of spatio-textual data, which contain both a geographical location and a textual description. Because traditional query processing technology focuses mainly on textual data, this raises two challenges: (1) The performance of search usually relies on the quality of the underlying data. Given the characteristics of spatio-textual data, how can data from multiple sources be effectively integrated and similar records eliminated? (2) When executing different kinds of spatio-textual queries, how can the spatial location and the textual description be used together to optimize the algorithms and improve performance? The contributions of this dissertation are summarized as follows.

1. Spatio-textual data integration: traditional data integration algorithms focus on either spatial data or textual data alone. To solve this problem, we propose an efficient hybrid-prefix based technique. On the one hand, for the spatial component, we devise an MBR-prefix based algorithm to provide efficient pruning. Using the spatial threshold, it selects a specific sub-region for each object to generate its representative spatial signature. Since the selected region is much smaller than the original MBR, this algorithm can quickly find the candidates. On the other hand, we design a hybrid-prefix based algorithm. It improves index utilization by combining infrequent keywords. Furthermore, this method can generate different spatial partitions according to the location distribution of keywords. It then strengthens the pruning power by adaptively combining spatial and textual prefixes.

2. Top-k spatio-textual search: traditional methods do not consider the optimization of the textual component. To solve this problem, we propose a partition-based algorithm. Based on the idea of the threshold algorithm, it incrementally finds the object with the currently highest spatial similarity or textual similarity, and then dynamically combines these objects to get the final results. During index construction, we divide objects into buckets according to their spatial location and textual similarity intervals. We take each bucket as a group and estimate its similarity. We then execute the Top-k query within each bucket and finally combine the results. In this way, we can search buckets with high spatio-textual similarity first and avoid visiting large amounts of useless data.

3. Top-k spatio-textual similarity search: traditional algorithms cannot support the “character-based error-tolerance” and “Top-k” requirements at the same time. To solve this problem, we propose a hybrid hierarchical index called HLtree. It dynamically selects the landmarks that best represent the data distribution. Using these landmarks, objects are divided into different partitions. To support multiple-keyword search, the algorithm incrementally finds the nearest object for each query keyword and combines them using efficient strategies. Furthermore, to avoid computing the similarity between every object and every landmark, the algorithm devises a character-based deletion method and uses the index to generate partitions. With these optimizations, it accelerates both indexing and query processing.
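
The threshold-algorithm idea mentioned in the second contribution can be illustrated with a minimal sketch. This is a generic Fagin-style threshold algorithm, not the dissertation's partition-based method: the function name `top_k_threshold`, the weighted-sum ranking function, and the weight `alpha` are all assumptions made for illustration, and the bucket index described in the abstract is not modeled.

```python
import heapq


def top_k_threshold(spatial, textual, k, alpha=0.5):
    """Illustrative threshold-algorithm sketch (hypothetical helper).

    spatial, textual: dicts mapping object id -> similarity in [0, 1].
    Assumption: both dicts cover the same objects, and the final score is
    alpha * spatial + (1 - alpha) * textual (a monotone ranking function).
    """
    if spatial.keys() != textual.keys():
        raise ValueError("both similarity maps must cover the same objects")
    if k <= 0:
        return []

    # Two lists sorted by descending similarity, scanned in lockstep.
    by_spatial = sorted(spatial.items(), key=lambda kv: kv[1], reverse=True)
    by_textual = sorted(textual.items(), key=lambda kv: kv[1], reverse=True)

    def score(oid):
        # "Random access": look up both similarities of one object.
        return alpha * spatial[oid] + (1 - alpha) * textual[oid]

    seen = set()
    best = []  # min-heap holding the k best (score, id) pairs found so far

    for (s_id, s_val), (t_id, t_val) in zip(by_spatial, by_textual):
        for oid in (s_id, t_id):
            if oid in seen:
                continue
            seen.add(oid)
            item = (score(oid), oid)
            if len(best) < k:
                heapq.heappush(best, item)
            elif item > best[0]:
                heapq.heapreplace(best, item)

        # No unseen object can score higher than this bound, because its
        # similarities are at most the values at the current scan positions.
        threshold = alpha * s_val + (1 - alpha) * t_val
        if len(best) == k and best[0][0] >= threshold:
            break  # early termination: the rest of the lists is skipped

    return sorted(best, reverse=True)


if __name__ == "__main__":
    spatial = {"a": 0.9, "b": 0.2, "c": 0.6, "d": 0.1}
    textual = {"a": 0.3, "b": 0.8, "c": 0.7, "d": 0.2}
    for s, oid in top_k_threshold(spatial, textual, k=2):
        print(oid, round(s, 2))
```

With the sample data above, the scan stops after three of the four positions and returns objects `c` and `a`, in that order. The early stop is what lets a threshold-style method avoid visiting objects that cannot reach the top k.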
Keywords/Search Tags: Spatio-Textual Data, Data Integration, Data Search, Location-based Services