Research On Data Security Deduplication In Cloud Storage

Posted on: 2022-08-18
Degree: Master
Type: Thesis
Country: China
Candidate: X L Mu
Full Text: PDF
GTID: 2518306566991059
Subject: Computer technology
Abstract/Summary:
With the rapid development of the Internet, the volume of data has grown enormously. Managing and storing this data costs users considerable labor and material resources, which has driven the rise of cloud storage. Because many users hold the same data, the cloud ends up storing massive amounts of duplicate data, which causes redundancy, wastes cloud storage resources, and greatly reduces the efficiency of data transmission. Data deduplication saves storage resources and bandwidth, but existing schemes suffer from low deduplication efficiency and a risk of data leakage: (1) Deduplication gives rise to many scheduling conflicts on the server. The open question is how to resolve these conflicts and improve the efficiency of deduplication operations while protecting user data privacy. Existing deduplication schemes do not consider this practical problem, so their efficiency leaves room for improvement. (2) Existing schemes do not consider whether ordinary deduplication methods leak the internal data of users who belong to special groups.

To address these problems, this thesis proposes two schemes. The first is a deduplication operation scheduling scheme based on LSTM networks. It resolves the scheduling conflicts that arise during deduplication and improves the efficiency of deduplication operations while protecting user data privacy. This thesis explores, for the first time, how to improve deduplication efficiency by resolving server scheduling conflicts. A predictor based on a long short-term memory (LSTM) network is trained to predict the server's future scheduling situation from the cloud server's historical operations. The prediction results are used to recommend an executable operation sequence, according to which server processes are scheduled and deduplication operations are performed.

The second is a group user data deduplication scheme based on density clustering. It addresses the problem that group users, i.e. users with highly similar attributes, affect the popularity threshold when they upload data, and it avoids data leakage at the cloud server during deduplication of group data. The scheme also supports data recovery, assisting users in recovering their data when data loss occurs. Group users and individual users are classified by their numerical user attributes, and the classification results are then used to identify the group of each subsequent new uploader. Combined with the popularity threshold, counts are maintained and updated dynamically so that newly uploading users do not change the current popularity threshold, which ensures data security.
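The abstract does not spell out how the dynamic counting interacts with group identification. As a rough illustration only, the sketch below assumes one possible reading: every user in the same group counts as a single unit toward a file's popularity, so repeated uploads from inside one group cannot push data over the threshold on their own. The class name `PopularityCounter`, its methods, and the "one count per group" rule are all hypothetical and are not taken from the thesis. Group ids are assumed to come from a separate clustering step that is not shown.

```python
from typing import Dict, Hashable, Optional, Set


class PopularityCounter:
    """Hypothetical sketch: count distinct uploader units per data fingerprint.

    Assumption (not from the thesis): all members of one group share a single
    counting unit, while each individual user is a unit of their own.
    """

    def __init__(self, threshold: int) -> None:
        self.threshold = threshold
        # fingerprint -> set of counting units seen so far
        self._units: Dict[str, Set[Hashable]] = {}

    def record_upload(self, fingerprint: str, user_id: str,
                      group_id: Optional[str] = None) -> bool:
        """Record one upload and return whether the data is now popular."""
        # Group members collapse to one unit; individuals count separately.
        if group_id is not None:
            unit = ("group", group_id)
        else:
            unit = ("user", user_id)
        self._units.setdefault(fingerprint, set()).add(unit)
        return self.is_popular(fingerprint)

    def is_popular(self, fingerprint: str) -> bool:
        """Data is popular once enough distinct units have uploaded it."""
        return len(self._units.get(fingerprint, ())) >= self.threshold


counter = PopularityCounter(threshold=3)
counter.record_upload("f1", "alice", group_id="g1")
counter.record_upload("f1", "bob", group_id="g1")   # same group, no new unit
counter.record_upload("f1", "carol")                # individual user
print(counter.is_popular("f1"))
```

Under this assumption, two uploads from group `g1` plus one individual upload yield only two counting units, so the data stays below a threshold of three until another independent uploader appears.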
Keywords/Search Tags: Data-Deduplication, Scheduling Optimization, Prediction Model, Attribute Similarity, Popularity Threshold