Research On Private Data Retrieval Based On Topic Model In Cloud Storage

Posted on: 2023-05-22
Degree: Master
Type: Thesis
Country: China
Candidate: Y H Wang
GTID: 2558306908450574
Subject: Computer Science and Technology
Abstract/Summary:
With the rapid development of information technology, it is difficult to ensure the privacy and security of data stored with third parties. Individual users and enterprises therefore encrypt their data before sending it to the cloud, which makes traditional plaintext keyword retrieval inapplicable and has given rise to searchable encryption. Current symmetric searchable encryption schemes mainly build their indexes directly from the association between keywords and documents. Such retrieval structures do not mine the semantic information of documents and so sometimes cannot return intelligent results to users; the index structure is complex and has a high space cost; retrieval efficiency degrades as the number of documents grows; and the schemes lack verification of the returned files. To address these problems, this paper studies symmetric searchable encryption, and its main contributions are as follows:

(1) Through a topic model, the association between documents and keywords is converted into document-topic-keyword probability distributions. The trained model mines the latent semantic information of documents, reduces the data dimension to the topic layer, obtains the word distribution of each topic, and groups related documents. In the experiments, topic models and parameters suited to the data set used in this paper are selected to obtain the document-topic and topic-word distribution matrices.

(2) We propose a symmetric searchable encryption scheme based on the topic model. The scheme builds a two-level index on the document-topic-word distribution. A Bloom filter is constructed in the topic-keyword layer to map query keywords to their corresponding topics, and an inverted index is constructed in the document-topic layer to return the relevant file set for each topic. The document-topic probabilities produced by the topic model are sorted and used as the relevance between a topic and a document, and the top-K files under the topic are returned through semantic expansion. The index structure supports dynamic file updates. Because the file index is built on topics, the data dimension is reduced from the keyword layer to the topic layer. The experimental results show that the secure index reduces the space cost and that, as the number of documents grows, retrieval time depends on the number of topics.

(3) We propose an efficient searchable encryption scheme that verifies whether the ranking of the returned ciphertexts is correct. In practice, a cloud server may be affected by a variety of factors that prevent it from honestly executing the protocol and returning a complete and correct set of documents. Building on the scheme above, we use multiset hash functions to generate verification evidence, which is stored in the document nodes of each topic's sorted linked list. In this way, both the integrity of all returned files and the correctness of the ranking when a specified number of files is returned can be verified. The incremental computability of the multiset hash function keeps the generation of ranking evidence in document nodes efficient and makes it well suited to dynamic searchable encryption schemes. A comparison shows that the scheme provides more functionality with good verification efficiency.
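The two-level index of contribution (2) can be illustrated with a minimal sketch. It is not the scheme from the thesis: the names `BloomFilter`, `trapdoor`, `build_index` and `search` are hypothetical, the keyed trapdoor is assumed to be HMAC-SHA256, and relevance scores and document identifiers are kept in plaintext for readability, whereas a real scheme would protect them. The topic-word and document-topic inputs are assumed to come from an already trained topic model.

```python
import hashlib
import hmac
from collections import defaultdict


class BloomFilter:
    """Fixed-size Bloom filter over byte strings (illustrative parameters)."""

    def __init__(self, size_bits=1024, num_hashes=4):
        self.size = size_bits
        self.k = num_hashes
        self.bits = bytearray(size_bits // 8 + 1)

    def _positions(self, item: bytes):
        # Derive k bit positions by prefixing the item with a counter byte.
        for i in range(self.k):
            digest = hashlib.sha256(bytes([i]) + item).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item: bytes) -> None:
        for p in self._positions(item):
            self.bits[p // 8] |= 1 << (p % 8)

    def __contains__(self, item: bytes) -> bool:
        return all(self.bits[p // 8] & (1 << (p % 8)) for p in self._positions(item))


def trapdoor(key: bytes, keyword: str) -> bytes:
    """Keyed token for a keyword (assumption: HMAC-SHA256 as the PRF)."""
    return hmac.new(key, keyword.encode("utf-8"), hashlib.sha256).digest()


def build_index(key, topic_words, doc_topics):
    """Build both index layers.

    topic_words: {topic: [keyword, ...]}   (topic-keyword layer)
    doc_topics:  {doc_id: {topic: prob}}   (document-topic layer)
    """
    # Layer 1: one Bloom filter per topic, holding keyword trapdoors.
    filters = {}
    for topic, words in topic_words.items():
        bf = BloomFilter()
        for word in words:
            bf.add(trapdoor(key, word))
        filters[topic] = bf

    # Layer 2: inverted index topic -> documents, sorted by descending probability.
    inverted = defaultdict(list)
    for doc_id, dist in doc_topics.items():
        for topic, prob in dist.items():
            inverted[topic].append((prob, doc_id))
    for postings in inverted.values():
        postings.sort(reverse=True)
    return filters, dict(inverted)


def search(filters, inverted, token: bytes, k: int):
    """Return the top-k document ids for every topic whose filter matches the token."""
    results = {}
    for topic, bf in filters.items():
        if token in bf:  # Bloom filters may yield rare false positives
            results[topic] = [doc for _, doc in inverted.get(topic, [])[:k]]
    return results
```

In this sketch the cost of a query grows with the number of topic filters that are tested and with K, not with the total number of keywords, which mirrors the dimension reduction from the keyword layer to the topic layer described above.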
Keywords/Search Tags: Searchable encryption, Topic model, Bloom filter, Inverted index, Verifiability