
Multiple Correlated Differential Privacy Matrix Factorization For Non-i.i.d Data

Posted on: 2018-01-07
Degree: Master
Type: Thesis
Country: China
Candidate: X C Fu
Full Text: PDF
GTID: 2348330542958182
Subject: Software engineering
Abstract/Summary:
With the rapid development of Internet applications, and especially the wide coverage of mobile terminals and wireless networks, e-commerce users and service providers have a huge demand for accurate access to and dissemination of information. Recommender systems are widely used as an effective way to address the information overload caused by massive amounts of information. At the same time, their widespread use lets users easily obtain a large amount of recommended information that matches their interests. This information often contains sensitive information about individuals or business organizations, so mining it through recommender systems risks revealing the privacy of all data providers. For example, a malicious attacker can actively obtain recommended information about a target through the target's hobbies or other associations, and then infer sensitive information about the target from those recommendations. To prevent such privacy disclosure, recommender systems generally need effective privacy protection methods to sanitize data. In recent years, the differential privacy model, which is based on statistical methods, has been favored by more and more researchers and widely used in research, because it has a strict mathematical definition and can provide protection against attackers with strong background knowledge.

Most existing work on recommender systems and their privacy protection assumes that data are independent and identically distributed (i.i.d.). For non-i.i.d. data, traditional privacy protection methods for recommender systems face two problems. First, complex associations in the data leave standard differential privacy unable to resist relational inference attacks, yet as recommendation algorithms are studied more intensively, a variety of correlation properties have been introduced into recommender systems to improve their effectiveness. Second, existing improved differential privacy methods for correlated data need to add excessive random noise. In relatively simple scenarios (e.g., relational databases) the resulting information loss is acceptable, but recommender system data is high-dimensional and extremely sparse, and adding a large amount of noise to data with complex correlations is catastrophic for recommendation accuracy.

This thesis focuses on privacy leakage in recommender systems for non-i.i.d. data. We improve a matrix factorization method with a new differential privacy perturbation mechanism, based on an analysis of the complex correlations of non-i.i.d. data, to address the problems above. The main research work is as follows:

(1) We review and analyze existing work on recommender systems and privacy-preserving methods, and point out the utility, security, and technical challenges of the traditional recommender system model and differential privacy methods under non-i.i.d. data. Assuming a centralized recommender system over non-i.i.d. data, we explain and analyze in detail the privacy leakage caused by inference attacks on non-i.i.d. data.

(2) We first analyze and summarize the complex correlations of non-i.i.d. data and propose the multiple correlated differential privacy matrix factorization method for the non-i.i.d. context. Using regularization theory, the multiple correlations of non-i.i.d. data are introduced as prior knowledge into the objective function of matrix factorization. Second, to guarantee the privacy of this method under the non-i.i.d. assumption, we devise the multiple correlated objective perturbation mechanism, which is based on the differential privacy Laplace mechanism. Finally, we give a theoretical proof of the privacy guarantee of the multiple correlated differential privacy matrix factorization model and analyze the time complexity of the algorithm.

(3) Based on the proposed multiple correlated differential privacy matrix factorization model and the multiple correlated objective perturbation mechanism, we design and implement a recommender system. We analyze the requirements of the recommender system and design its overall architecture. We also describe the steps of each submodule of the algorithm in detail. Finally, we elaborate on the complexity of each part of the recommender system and of the whole algorithm.

(4) We conduct experiments on two real datasets, MovieLens and BookCrossing. By comparing against traditional matrix factorization and an improved differential privacy method for non-i.i.d. data, and by testing different iteration numbers and privacy levels, we show that the proposed model obtains better recommendation results on non-i.i.d. data. We also analyze prediction performance on the two datasets, which differ in sparsity. The experimental results show that for sparser datasets, the recommendation model proposed in this thesis yields results with higher precision.
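
As a rough illustration of the idea in item (2), the sketch below combines a correlation regularizer with Laplace objective perturbation in a plain gradient-descent matrix factorization. It is not the thesis's algorithm. The function name `perturbed_mf`, the single user-correlation matrix `S`, the regularizer form, the hyperparameters, and the `sensitivity` value are all assumptions made for illustration. In particular, the abstract does not say how the thesis calibrates noise to correlated data, so `sensitivity` is only a placeholder and this sketch carries no proven privacy guarantee.

```python
import numpy as np

def perturbed_mf(R, mask, S, k=2, lam=0.1, beta=0.1, epsilon=1.0,
                 sensitivity=1.0, lr=0.01, epochs=200, seed=0):
    """Matrix factorization with a correlation regularizer and Laplace
    objective perturbation (illustrative sketch, hypothetical parameters).

    R    : (n_users, n_items) rating matrix
    mask : same shape, 1 where a rating is observed, 0 elsewhere
    S    : (n_users, n_users) non-negative user-correlation weights
    """
    rng = np.random.default_rng(seed)
    n_users, n_items = R.shape
    U = 0.1 * rng.standard_normal((n_users, k))
    V = 0.1 * rng.standard_normal((n_items, k))

    # Objective perturbation: noise is drawn once and enters the objective
    # as a linear term sum_j <eta_j, v_j>, instead of being added to outputs.
    # The scale sensitivity/epsilon is the standard Laplace-mechanism scale;
    # the sensitivity value itself is a placeholder assumption.
    eta = rng.laplace(0.0, sensitivity / epsilon, size=(n_items, k))

    # Row-normalize correlation weights (rows with no neighbors stay zero).
    S = np.asarray(S, dtype=float)
    row_sums = S.sum(axis=1, keepdims=True)
    W = np.divide(S, row_sums, out=np.zeros_like(S), where=row_sums > 0)

    for _ in range(epochs):
        E = mask * (R - U @ V.T)   # errors on observed entries only
        D = U - W @ U              # deviation from correlated users' average
        # Gradients of:
        #   1/2 ||mask*(R - U V^T)||^2 + lam/2 (||U||^2 + ||V||^2)
        #   + beta/2 ||U - W U||^2 + sum(eta * V)
        grad_U = -E @ V + lam * U + beta * (D - W.T @ D)
        grad_V = -E.T @ U + lam * V + eta
        U -= lr * grad_U
        V -= lr * grad_V
    return U, V

# Toy data (made up for the example): 4 users, 3 items, 0 = unobserved.
R = np.array([[5, 3, 0], [4, 0, 1], [1, 1, 5], [0, 1, 4]], dtype=float)
mask = (R > 0).astype(float)
S = np.array([[0, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]], dtype=float)

U, V = perturbed_mf(R, mask, S, epsilon=1.0)
predictions = U @ V.T
```

A smaller `epsilon` gives a larger noise scale and therefore a stronger perturbation of the learned item factors, which is the privacy/accuracy trade-off the abstract's experiments vary through "privacy levels".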
Keywords/Search Tags: non-i.i.d. data, recommender system, matrix factorization, differential privacy