Font Size: a A A

The Research On The Muti-keywords Search Technology Over P2P Network Based On Bloom Filter

Posted on:2013-02-24Degree:MasterType:Thesis
Country:ChinaCandidate:M F WangFull Text:PDF
GTID:2248330395484910Subject:Software engineering
Abstract/Summary:PDF Full Text Request
There are two basic modes for the existing multi-keyword searching over P2Pnetwork based on bloom filter (BF), which are “and query” and “or query”, it hasreduced the network traffic generated by the retrieval process by bloom filterencoding effectively. However, in P2P networks, the joining and leaving of nodes isfree, the documents stored on the nodes will be inserted or removed as theparticipation or the departure of nodes. However, bloom filter does not support deleteoperation, when documents removed from the network frequently, the system needs tore-construct bloom filter to accommodate the remove of documents constantly. It willlead an additional system overhead. In addition, The traditional search engine supportthat before the search words prefaced by the plus to show that the search results mustinclude this keyword, while with minus signs show that search results could notinclude this keyword. However, in the multi-keyword search in P2P networks have notstudied this query mode.In this paper, with P2P multi-keyword search mechanism asthe research object, we focus on the dynamic characteristic of P2P networks, a lot ofnetwork traffic generated by the retrieval process and the expansion for P2Pmulti-keyword search mechanism, The main contributions of this paper are asfollows:Firstly, an “and query” model based on counting bloom filter (CBF) is proposedin this paper. Using counting bloom filter storage keywords index table, countingbloom filters support the delete operation, and can easily be converted to the bloomfilter.So, this model can adapt to system’s churn and reduce traffic. Simulation resultsshow that this model outperforms current “and query” model based on bloom filter inboth the recall rate and the query latency, when the nodes join and leave from the P2Pnetworks frequently, in the hit rate,it has improved5%to10%,At the same time in thequery latency, it has reduced10ms~40ms.Secondly, an “or query” model based on counting bloom filter (CBF) is proposedin this paper. Simulation results show that this model outperforms current “or query”model based on bloom filter in both the recall rate and the query latency, when thenodes join and leave from the P2P networks frequently, in the hit rate,it has improved7%to11%,At the same time in the query latency, it has reduced12ms~45ms.Thirdly, an “sub query” model based on bloom filter (BF) is proposed in this paper. Simulation results show that this model outperforms “sub query” model stragthforward the set of keywords in both the recall rate and the query latency. in the hitrate,it has improved7%to15%, in the query latency, it has reduced10ms~37ms, Atthe same time, in the production of network traffic, it has reduced53.01%.
Keywords/Search Tags:P2P networks, Multi-keywords searching, DHT, Bloom Filter, CountingBloom Filter, Inverted index, Sub query
PDF Full Text Request
Related items