Font Size: a A A

A Research Of KNN Query For Uncertain Graph

Posted on:2017-01-03Degree:MasterType:Thesis
Country:ChinaCandidate:L M HeFull Text:PDF
GTID:2308330503983618Subject:Computer software and theory
Abstract/Summary:PDF Full Text Request
As an important and representative data structure, graph can be used to describe the complex relationship among different areas. In the information age of data extremely fast-growing, its uncertainty is more and more common. How to effectively deal with the uncertain graph data is a huge project.For the research of uncertain graph, many ideas are derived from certain graphs which can be seen as a special uncertain graph with an edge probability of 1. In many cases, the study over graph cannot avoid distance, but because of its own characteristics, the calculation of the distance becomes more complicated. In the research of this paper, we mainly study the problem of distance and efficiency.Firstly, by analyzing the existing research on distance, we point out the main encountered problems in the related research of uncertain graph, one of which is the exact computation is NP hard problem, and the second is the paths with a same shortest distance in many cases are not independent.Secondly, based on the related calculation method about the distance, a new calculation method about reachable distance of uncertain graph is proposed. When reachable probability between the two given points is more than 0.5, we use expected shortest distance to measure the reachable distance, on the contrary, the length of the most likely path. In the existing research on the distance of uncertain graph, sampling method is the mostly used as the core processing method, but the number of samples has a great influence on the result which is volatile. In order to avoid sampling, two approximate methods are presented, one is based on arithmetic mean and another is based on weighted average. By merging theory with experiments, we verify that, when to find the reachable shortest distance in possible world, compared to the most popular dijsktra method, the advantage of BFS method is more outstanding.Finally, when the distance is applied to kNN query, the basic algorithm only takes into account the distance, while ignoring its important features. In order to make full use of the relationship between distance and probability, the index function is used to combine the two together skillfully. After these, emphasis factor is added to distinguish those nodes who are very similar to each other. For speeding up the efficiency of the algorithm in large scale graph, node cut method was used to reduce its size. Eventually, the experimental results show that the method has strong operability.
Keywords/Search Tags:Uncertain Graph, Reachable Distance, Probability, kNN Query, Node Cut
PDF Full Text Request
Related items