
Spatio-textual Similarity Join with Threshold Constraint in MapReduce

Posted on: 2016-01-13; Degree: Master; Type: Thesis
Country: China; Candidate: P Jin; Full Text: PDF
GTID: 2308330479951067; Subject: Computer application technology
Abstract/Summary:
With the development of mobile positioning technology and geographic information systems, location-based services are widely used in people’s lives. In traditional spatial database queries, users care only about location information. As user needs have grown, however, a new kind of query over spatial data has emerged that combines location with keywords. The spatio-textual similarity join extends this query and has many applications. In social-network friend recommendation, for example, users forming interest groups around activities such as shopping or travel want to find people who have similar interests and are close by. Because existing work does not take full account of users' differing needs, this thesis studies the spatio-textual similarity join with threshold constraints in MapReduce.

First, we propose a k-nearest-neighbor join with a textual similarity threshold constraint in MapReduce: given a textual similarity threshold, the operation returns, for each object, its k nearest neighbors among the objects that meet the threshold. The algorithm in the MapReduce framework has three steps: generating a global ordering for the textual signatures, the local k-nearest-neighbor join, and the overall k-nearest-neighbor join. We build an index using the ideas of multi-prefix, positive prefix, and reverse prefix to improve search efficiency.

Second, we propose a top-k join with a spatial distance threshold constraint in MapReduce. The join space is divided into grids. In the map stage, each grid cell and its surrounding cells are assigned to the same reducer. In the reduce stage, an inverted index based on the upper bound of similarity is built over all objects. In the join stage, we use the k-th result to prune dissimilar objects, which reduces unnecessary computation and improves join speed.

Finally, we evaluate the methods on a real data set and a randomly generated data set, and the experiments show that the proposed methods are effective.
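To make the second method more concrete, the following is a minimal single-process sketch of a grid-based top-k join under a spatial distance threshold. It is an illustration under stated assumptions, not the thesis's implementation: the function names (`map_to_cells`, `reduce_cell`, `top_k_join`) are hypothetical, Jaccard similarity over keyword sets is assumed because the abstract does not name a similarity measure, the grid cell side is assumed to equal the distance threshold `d`, and a simple set-size upper bound on Jaccard stands in for the inverted index built on the similarity upper bound. Object ids are assumed to be unique and comparable.

```python
import heapq
import math
from collections import defaultdict
from itertools import product


def jaccard(a, b):
    """Jaccard similarity of two keyword sets (assumed measure)."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def map_to_cells(objects, d):
    """Map stage: send each object to its own cell ('home') and to the
    8 surrounding cells ('guest'). Cell side equals the threshold d, so
    any two objects within distance d meet in some reducer."""
    buckets = defaultdict(list)
    for oid, x, y, tokens in objects:
        cx, cy = math.floor(x / d), math.floor(y / d)
        for dx, dy in product((-1, 0, 1), repeat=2):
            is_home = dx == 0 and dy == 0
            buckets[(cx + dx, cy + dy)].append(
                (is_home, oid, x, y, frozenset(tokens)))
    return buckets


def reduce_cell(bucket, d, k):
    """Reduce stage: local top-k pairs for one cell.
    A pair is handled only where the smaller-id object is at home,
    so each pair is evaluated in exactly one reducer."""
    heap = []  # min-heap of (similarity, id_a, id_b); heap[0] is the k-th best
    for home_a, ida, xa, ya, ta in bucket:
        if not home_a:
            continue
        for _, idb, xb, yb, tb in bucket:
            if ida >= idb:
                continue
            if math.hypot(xa - xb, ya - yb) > d:
                continue  # violates the spatial distance threshold
            if len(heap) == k:
                # Size-based upper bound: J(A, B) <= min(|A|,|B|) / max(|A|,|B|).
                longest = max(len(ta), len(tb))
                bound = min(len(ta), len(tb)) / longest if longest else 0.0
                if bound <= heap[0][0]:
                    continue  # cannot beat the current k-th result
            sim = jaccard(ta, tb)
            if len(heap) < k:
                heapq.heappush(heap, (sim, ida, idb))
            elif sim > heap[0][0]:
                heapq.heapreplace(heap, (sim, ida, idb))
    return heap


def top_k_join(objects, d, k):
    """Simulate map, reduce, and a final merge of the local top-k lists."""
    candidates = []
    for bucket in map_to_cells(objects, d).values():
        candidates.extend(reduce_cell(bucket, d, k))
    return heapq.nlargest(k, candidates)


if __name__ == "__main__":
    # Hypothetical data: (id, x, y, keywords)
    objs = [
        (1, 0.0, 0.0, {"coffee", "books"}),
        (2, 0.5, 0.5, {"coffee", "books", "art"}),
        (3, 1.2, 0.1, {"coffee", "books", "art"}),
        (4, 9.0, 9.0, {"coffee", "books"}),
    ]
    for sim, a, b in top_k_join(objs, d=1.0, k=2):
        print(a, b, round(sim, 3))
```

In a real MapReduce job the buckets would be shuffled to separate reducers and the final merge would run as a second pass. The pruning test uses only the local k-th result, which is safe because a pair that cannot enter a local top-k list cannot enter the global one either.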
Keywords/Search Tags: spatio-textual similarity join, MapReduce, R-tree, grid, threshold