Font Size: a A A

Processing K-Nearest Neighbors Query Over Uncertain Graphs

Posted on:2013-12-04Degree:MasterType:Thesis
Country:ChinaCandidate:X ZhangFull Text:PDF
GTID:2248330374467147Subject:Computer software and theory
Abstract/Summary:PDF Full Text Request
In recent decades, with the popularity of the Internet, the amount of information has been exponentially grown. Under this pressure, the traditional deterministic data management has a great deal of development, various of new technologies to deal with the analysis of data has been emerged, such as the current cloud-computing products and technologies, Hadoop, GFS, HDF, Bigtable, HBase, MapReduce and so on. As technology advances, people found a lot of data in the real world, where there is the uncertainty caused by a variety of reasons, such as the original data is not accurate, the use of coarse-grained data collection to meet the special applications, such as privacy protection, etc. These can lead to inaccurate data, but the traditional deterministic data processing method is not suit for uncertain data, uncertain data processing in this case has been widely appreciated, more and more methods and techniques to deal with uncertain data have been emerged.Complex network such as biological networks, social networks, communication network has been extensive research, these data with the uncertainty due to noise and errors introduced by the data out of, we can model these with uncertain graph models. k-nearest neighbors query problem is a query on a graph to find the nearest k neighbors, it is a fundamental problem about the uncertain graph. Uncertain graph on k-nearest neighbors query is still in its infancy, now the solution methods is relatively little. we propose to solve the uncertain graph k nearest neighbor query algorithm framework, and propose a novel distance metric, the metric can be a good solution for uncertainty graph’s k-nearest neighbors query. And then for the query, this paper proposes an efficient and accuracy algorithm, and then we further optimized the algorithm to further improve its efficiency. Although the determined algorithm has been greatly improved, when the amount of graph nodes and edges is large, using the deterministic algorithm to manage is a difficult thing, so we also proposed a method to calculate kMinDist based sampling algorithm, the algorithm can only give approximate results, but the algorithm has high efficiency, and we can improve the accuracy of the results by increasing the number of samples. Finally, our experiments confirmed that our algorithm is effective and efficient.
Keywords/Search Tags:uncertain data, uncertain graph, k-nearest neighbors, kMinDist
PDF Full Text Request
Related items