
Research And Implementation On Spatial Keyword Query Technology Towards Internet Location-based Service

Posted on: 2013-04-10
Degree: Master
Type: Thesis
Country: China
Candidate: J K Pan
GTID: 2298330422474293
Subject: Computer Science and Technology
Abstract/Summary:
With the prevalence of the mobile Internet and Web 2.0, more and more applications blend geographical information with textual information. On the one hand, more applications are providing users with location-based services. On the other hand, text content can be paired with geographical information through features such as place names or street addresses. This gives significance to spatial keyword queries. Existing spatial keyword queries focus on exact match or prefix match of the keywords, whereas wildcard-based imprecise matching has more potential in many realistic scenarios. Bulk loading can use known information about the data to build the index, improving the efficiency of both storage and queries. However, the existing bulk loading algorithms all target spatial queries and do not take advantage of the known information about the keywords. With the spread of information technology, the scale of data in every domain is increasing rapidly, which creates demand for high performance, scalability, dependability and other qualities in data processing. To address these problems, research is carried out in three areas: spatial keyword queries supporting wildcards, bulk loading for spatial keyword queries, and spatial keyword queries in HBase.

Firstly, two methods are put forward to implement spatial keyword queries supporting wildcards with the help of the R-tree and the Permuterm index. The inverted file and R-tree integrated index (IRII for short) can prune using both spatial and keyword information while visiting nodes, which greatly improves query efficiency and suits situations that require high time efficiency.
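As a rough illustration of the keyword side only, the sketch below shows how a Permuterm index can answer a single-wildcard pattern with a prefix lookup. It is not the thesis's IRII structure: the R-tree integration is omitted, and the function names (`rotations`, `build_permuterm`, `wildcard_lookup`) and the restriction to exactly one `*` are assumptions made for this example.

```python
from bisect import bisect_left

END = "$"  # end-of-term marker, assumed not to occur inside any keyword


def rotations(term):
    """Return every rotation of term + END, e.g. 'ab' -> ['ab$', 'b$a', '$ab']."""
    s = term + END
    return [s[i:] + s[:i] for i in range(len(s))]


def build_permuterm(terms):
    """Build a sorted list of (rotation, original_term) pairs."""
    return sorted((rot, t) for t in set(terms) for rot in rotations(t))


def wildcard_lookup(index, pattern):
    """Match a pattern containing exactly one '*' (hypothetical restriction).

    A pattern X*Y is rotated to the key Y$X, so the wildcard falls at the
    end and the query becomes a prefix search over the sorted rotations.
    """
    if pattern.count("*") != 1:
        raise ValueError("this sketch supports exactly one '*' per pattern")
    head, tail = pattern.split("*")
    key = tail + END + head
    matches = set()
    # (key,) sorts before every (key..., term) pair, so this finds the
    # first rotation that could start with the key.
    i = bisect_left(index, (key,))
    while i < len(index) and index[i][0].startswith(key):
        matches.add(index[i][1])
        i += 1
    return matches


index = build_permuterm(["hotel", "hostel", "hospital", "motel"])
print(sorted(wildcard_lookup(index, "ho*l")))
```

In an index that integrates keywords with an R-tree, a lookup of this kind would presumably be combined with spatial pruning at each node; that part is left out here.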
The Prefix Bloom Filter and R-tree integrated index (PBFRII for short) uses a Prefix Bloom Filter instead of the inverted file, gaining space efficiency at the cost of time efficiency, and suits situations requiring high space efficiency.

Secondly, the TPA algorithm for bulk loading the R-tree based spatial keyword index is proposed. When loading the data, the algorithm considers not only spatial factors such as the perimeter and overlap of the nodes' minimum bounding rectangles, but also the overlap of keywords between nodes. A theoretical analysis of the algorithm is conducted as well. Experimental results demonstrate the high performance of our method compared with the traditional algorithms TGS and STR.

Lastly, the implementation of spatial keyword queries in HBase is presented. The design of the storage format makes use of the indexing and sorting of row keys and the clustering property of the space-filling curve. The MapReduce parallel computing model is used when loading the data to enhance efficiency. To query and process spatial keyword data in HBase, three modes are proposed: MapReduce, Filter and Index. The efficiency of our methods is validated by experiments.
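The row-key idea in the HBase part can be sketched as follows. The abstract does not say which space-filling curve is used, so a Z-order (Morton) curve is assumed here purely for illustration. The key layout, the 16-bit resolution and the helper names (`quantize`, `interleave`, `row_key`) are likewise hypothetical. The point is only that a fixed-width curve code at the front of the key lets lexicographic row-key order follow the order of cells along the curve.

```python
def quantize(value, lo, hi, bits):
    """Map value in [lo, hi] to an integer cell index in [0, 2**bits - 1]."""
    cells = 1 << bits
    idx = int((value - lo) / (hi - lo) * cells)
    return min(max(idx, 0), cells - 1)  # clamp so hi itself stays in range


def interleave(x, y, bits):
    """Interleave the low `bits` bits of x and y into one Z-order code.

    Bits of x go to even positions and bits of y to odd positions.
    """
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (2 * i)
        code |= ((y >> i) & 1) << (2 * i + 1)
    return code


def row_key(lon, lat, object_id, bits=16):
    """Build a sortable row key: zero-padded hex Z-order code + object id.

    Assumed layout for this sketch, not the storage format of the thesis.
    """
    x = quantize(lon, -180.0, 180.0, bits)
    y = quantize(lat, -90.0, 90.0, bits)
    z = interleave(x, y, bits)
    width = (2 * bits + 3) // 4  # hex digits needed for 2*bits bits
    return f"{z:0{width}x}_{object_id}"


print(row_key(116.40, 39.90, "poi42"))
```

Because the hex code is zero-padded to a fixed width, a range scan over a key prefix corresponds to a contiguous run of cells on the curve, which is what makes row-key sorting useful for spatial clustering.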
Keywords/Search Tags: Spatial Keyword Query, Index, Wildcard, Bulk Loading, HBase