Research On Spatio-textual K Nearest Neighbor Joins

Posted on: 2016-07-13
Degree: Master
Type: Thesis
Country: China
Candidate: M Y Yang
GTID: 2308330476953316
Subject: Computer Science and Technology
Abstract/Summary:
In recent years, the development and spread of intelligent portable devices, social networks, and wireless sensor networks have given rise to a variety of location-based services and applications. These generate large amounts of spatio-textual data, that is, data carrying both spatial and textual information. Designing useful query types and efficient processing methods for such data is a meaningful research direction. In this thesis, we propose a novel problem: the spatio-textual k-nearest neighbor join. It combines spatial k-nearest neighbor queries and joins with text similarity queries and joins. It enriches the operations available on spatio-textual data and provides better functionality for many services and applications. As data scale increases, the traditional centralized processing mode runs into performance bottlenecks, so it is necessary to handle this problem in a distributed computing environment.

We first discuss the origin and outlook of the research problem. We then review the research, techniques, and approaches of the related fields, including spatial k-nearest neighbor queries and joins, text similarity queries and joins, and spatio-textual similarity queries and joins. We give a brief introduction to the distributed processing framework MapReduce and its popular open-source implementation, Hadoop. We then formalize the spatio-textual k-nearest neighbor join problem and propose two MapReduce-based approaches to solve it: a naive approach based on block nested loops, and an improved approach based on the filter-and-refine framework. We conduct experiments on a distributed cluster, and the experimental results show that the proposed approaches are feasible for handling spatio-textual data and that the improved approach achieves better query performance than the naive approach.
Keywords/Search Tags: data query, distributed computing, spatio-textual data, k-nearest neighbor join
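
The abstract names a naive block-nested-loop approach but does not give its formal definition, so the following is only a minimal single-machine sketch under stated assumptions, not the thesis's actual MapReduce algorithm. It assumes a hypothetical combined distance (a weighted sum of normalized Euclidean distance and Jaccard dissimilarity over term sets); the names `st_distance`, `knn_join_bnl`, `alpha`, and `max_dist` are illustrative and do not come from the source. Each pair of blocks plays the role of one local join task, and the final merge plays the role of the step that combines partial results per object.

```python
import heapq
import math
from itertools import product


def jaccard(a, b):
    """Jaccard similarity of two term sets (defined as 1.0 when both are empty)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def st_distance(r, s, alpha=0.5, max_dist=1.0):
    """Assumed combined distance: alpha weights space, (1 - alpha) weights text."""
    spatial = math.dist(r["loc"], s["loc"]) / max_dist
    textual = 1.0 - jaccard(r["terms"], s["terms"])
    return alpha * spatial + (1.0 - alpha) * textual


def blocks(items, size):
    """Split a list into consecutive blocks of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def knn_join_bnl(R, S, k, block_size=2, alpha=0.5, max_dist=1.0):
    """For every r in R, return the ids of its k nearest objects in S."""
    partial = {}  # r id -> candidate (distance, s id) pairs from all block pairs
    for r_block, s_block in product(blocks(R, block_size), blocks(S, block_size)):
        # Local join of one R block with one S block: keep the local top-k per r.
        for r in r_block:
            local = [(st_distance(r, s, alpha, max_dist), s["id"]) for s in s_block]
            partial.setdefault(r["id"], []).extend(heapq.nsmallest(k, local))
    # Merge step: the global top-k for r is the top-k of its local top-k lists.
    return {
        rid: [sid for _, sid in heapq.nsmallest(k, cands)]
        for rid, cands in partial.items()
    }
```

Keeping only the local top-k per block pair is safe because any object in the global top-k must also be in the top-k of its own block. A filter-and-refine variant would instead prune block pairs or candidates with cheap bounds before computing exact distances; that is not shown here because the abstract does not describe the bounds used.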