Frequent itemset mining is a classical technique for mining association rules from data sets. In recent years, how to protect data privacy in frequent itemset mining has been widely investigated. As the de facto standard for data privacy protection in the research community, the differential privacy model, with its rigorous mathematical definition, ensures that attackers cannot determine users' exact information. Frequent itemset mining under central differential privacy has been widely studied. However, when there is no trusted third-party collector, existing techniques cannot guarantee both data privacy and mining accuracy. Researchers have therefore introduced the local differential privacy model, in which data perturbation is carried out on the client side and aggregation and reconstruction are conducted on the server. However, these techniques are not applicable to decentralized frequent itemset mining scenarios. When a client applies local differential privacy protection to multiple items, the association between items must be treated as a new dimension to protect. As the number of items increases, the data domain to be protected grows exponentially and the variance introduced by data perturbation grows with it, ultimately degrading the accuracy of the server's aggregation results. This thesis proposes a high-confidence frequent itemset mining method under local differential privacy. The itemsets owned by users are represented in matrix form. (1) The client generates a non-diagonal matrix through calculation. The biggest challenge is how to protect this matrix under local differential privacy while fitting it quickly and accurately. By the nature of local differential privacy, if every element is protected separately, the privacy budget is split equally among the elements, which increases communication overhead and introduces greater variance, impairing the server’s estimation of users’ 
item matrix. Because the leading (largest) singular value produced by singular value decomposition (SVD) carries the most information about the matrix, this thesis selects only the element in the upper-left corner of the matrix for data perturbation. Applying differential privacy to a single element greatly reduces the number of client communications. Meanwhile, the full privacy budget is spent on that element, so the perturbation variance is smaller and the server obtains more accurate estimates. (2) Building on this theoretical work, the thesis conducts extensive comparative experiments against existing research, designs and develops a toolset for frequent itemset mining under local differential privacy, and demonstrates its application in a home Internet of Things scenario. In summary, this thesis studies frequent itemset mining under local differential privacy in depth by adapting a matrix decomposition method. Compared with existing methods, it uses SVD to achieve a better trade-off between accuracy and communication overhead under a strict privacy guarantee, and yields frequent itemsets better suited to subsequent association rule mining. This work is significant for association rule mining research under local differential privacy.
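
The budget-splitting argument above can be illustrated with a minimal sketch. It assumes the Laplace mechanism and a unit sensitivity, neither of which is specified in this abstract, and all function names are hypothetical. It is not the thesis's implementation; it only shows why perturbing one element with the whole budget gives a smaller variance than perturbing k elements with a budget of epsilon / k each.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponential draws with mean `scale`
    # follows a Laplace(0, scale) distribution.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def laplace_variance(epsilon, sensitivity=1.0):
    # Laplace mechanism (assumed here): scale b = sensitivity / epsilon,
    # and the variance of Laplace(0, b) is 2 * b^2.
    scale = sensitivity / epsilon
    return 2.0 * scale ** 2


def per_element_variance(epsilon, num_elements, sensitivity=1.0):
    # If every element is protected, the budget is split equally, so each
    # element is perturbed with only epsilon / num_elements.
    return laplace_variance(epsilon / num_elements, sensitivity)


def perturb_single_value(value, epsilon, sensitivity=1.0, rng=None):
    # Spend the whole budget on one value, e.g. the leading singular value.
    rng = rng or random.Random()
    return value + laplace_noise(sensitivity / epsilon, rng)


if __name__ == "__main__":
    eps, k = 1.0, 4
    print("variance, one element, full budget:", laplace_variance(eps))
    print("variance per element, budget split k ways:", per_element_variance(eps, k))
    print("noisy value:", perturb_single_value(3.0, eps, rng=random.Random(0)))
```

Under these assumptions the per-element variance grows with the square of the number of protected elements, which is the effect the single-element perturbation is meant to avoid.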