Study On Top-k Query Processing Over Uncertain Data In P2P Network

Posted on: 2011-06-09 | Degree: Master | Type: Thesis
Country: China | Candidate: X Wang
GTID: 2248330395458341 | Subject: Computer application technology
Abstract/Summary:
Top-k query processing over certain data in centralized environments has been studied widely in recent years. As understanding of the objective world deepens, uncertain data has attracted growing attention, and as networks develop, data is becoming increasingly distributed. Top-k queries over uncertain data in distributed environments therefore pose a new challenge.
In many real-world applications, such as sensor networks and Peer-to-Peer (P2P) systems, data is inherently uncertain and fuzzy because of imprecise measurement tools, the test environment, and network delay. Distributed applications are increasingly popular, much of the data they collect is uncertain, and uncertain data differs inherently from certain data. It is therefore necessary to solve the problem of efficiently answering global Top-k queries over uncertain data in a distributed network.
To this end, this thesis presents a novel approach to processing Top-k queries over uncertain data in a large-scale P2P network, in which the dataset is horizontally partitioned over the peers and organized in a super-peer framework. First, each peer builds a grid index over its local uncertain data. Each super-peer then indexes summary information about all the uncertain data in the whole network, called the global index. Based on the global index, we introduce an effective global pruning strategy at the super-peers that substantially reduces communication and computation costs. We then propose a local pruning approach at the peers, based on the relationship between the scores of uncertain data, to reduce computation costs. After that, we propose an efficient algorithm that finds the results with minimum communication and computation costs.
Finally, extensive simulation experiments have been conducted to show the efficiency of our proposed approach in terms of communication costs and response time.
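The pipeline described in the abstract (local grid index, global index on the super-peer, global pruning, then local pruning) can be illustrated with a small sketch. This is not the thesis's algorithm: the abstract does not state the ranking semantics or the grid layout, so the sketch assumes that tuples are ranked by expected score (score times existence probability) and that each peer summarizes its data in a one-dimensional grid over that value. All names (`UTuple`, `build_grid_summary`, `global_threshold`, `top_k`) and the `cell_width` parameter are hypothetical.

```python
import heapq
from dataclasses import dataclass


@dataclass(frozen=True)
class UTuple:
    """An uncertain tuple: a score and an existence probability (assumed model)."""
    score: float
    prob: float

    @property
    def expected(self) -> float:
        # Assumed ranking function; the thesis may use different semantics.
        return self.score * self.prob


def build_grid_summary(tuples, cell_width=10.0):
    """Peer side: map grid cell -> (count, min expected, max expected)."""
    summary = {}
    for t in tuples:
        cell = int(t.expected // cell_width)
        cnt, lo, hi = summary.get(cell, (0, float("inf"), float("-inf")))
        summary[cell] = (cnt + 1, min(lo, t.expected), max(hi, t.expected))
    return summary


def global_threshold(summaries, k):
    """Super-peer side: a lower bound on the k-th largest expected score.

    Cells are visited by descending lower bound; once they cover k tuples,
    at least k tuples are known to reach the last cell's lower bound.
    """
    cells = sorted(
        ((lo, cnt) for s in summaries.values() for cnt, lo, _hi in s.values()),
        reverse=True,
    )
    seen = 0
    for lo, cnt in cells:
        seen += cnt
        if seen >= k:
            return lo
    return float("-inf")  # fewer than k tuples overall: nothing can be pruned


def top_k(peers, k, cell_width=10.0):
    """Return (top-k tuples, ids of peers that had to be contacted)."""
    summaries = {pid: build_grid_summary(ts, cell_width) for pid, ts in peers.items()}
    tau = global_threshold(summaries, k)
    contacted, candidates = [], []
    for pid, summary in summaries.items():
        # Global pruning: skip peers whose best cell cannot reach the threshold.
        if not summary or max(hi for _c, _lo, hi in summary.values()) < tau:
            continue
        contacted.append(pid)
        # Local pruning: a contacted peer ships only tuples that reach the threshold.
        candidates.extend(t for t in peers[pid] if t.expected >= tau)
    return heapq.nlargest(k, candidates, key=lambda t: t.expected), contacted


peers = {
    "p1": [UTuple(100, 0.5), UTuple(80, 0.25)],
    "p2": [UTuple(20, 0.5), UTuple(8, 1.0)],
    "p3": [UTuple(60, 0.75), UTuple(5, 1.0)],
}
result, contacted = top_k(peers, k=2)
print([t.score for t in result], contacted)
```

In this toy run the threshold derived from the summaries excludes peer `p2` without contacting it, which is the kind of communication saving the global pruning step is meant to provide. In a real deployment the super-peer would hold only the summaries and each peer would apply the local filter itself; the sketch runs both sides in one process for brevity.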
Keywords/Search Tags: P2P, Top-k, uncertain data, grid index