Skyline Query Method Satisfying Differential Privacy Under Diverse Data

Posted on: 2022-11-30
Degree: Master
Type: Thesis
Country: China
Candidate: Y Y Fang
GTID: 2518306614459044
Subject: Computer Software and Application of Computer
Abstract/Summary:
In the era of big data, Skyline queries are a favored technique for retrieving information, and differential privacy has become a research hotspot in data privacy protection in recent years. Data is also diverse. For example, a multidimensional tuple may be missing values in some dimensions, in which case it is called incomplete data. If a tuple exists only with a certain probability, it is called uncertain data. This thesis studies Skyline query methods that satisfy differential privacy under such diverse data.

First, the thesis studies Skyline queries that satisfy differential privacy over multi-dimensional incomplete data and proposes a corresponding algorithm. The method consists of four parts. The first part introduces the concept of a weighted decision tree and proposes a classification algorithm that uses it to classify incomplete data sets. The second part proposes a pruning algorithm based on attribute reduction, which prunes tuples in which all important attributes are missing and thereby improves the efficiency of subsequent operations. The third part proposes a skyline query algorithm for multi-dimensional incomplete data and introduces, for the first time, the concept of optimal shadow skyline points, which improves query efficiency. The fourth part addresses the fact that existing skyline queries over incomplete data do not consider privacy protection and that existing privacy protection techniques have limitations: it uses differential privacy to add noise to the query results. To balance the privacy and availability of the data, a differential privacy budget allocation algorithm is proposed. The algorithm groups buckets using a greedy strategy, assigns a privacy budget parameter to each group according to a Taylor series expansion method and a weight ratio k, and then computes the Laplace noise to be added for each group. The proposed algorithm introduces differential privacy into skyline queries over incomplete data for the first time and adds corresponding noise to the query results, which better protects the privacy of the data.

Second, the thesis studies Skyline queries that satisfy differential privacy over uncertain data and proposes a corresponding algorithm, which is divided into three parts. The first part proposes a probabilistic frequent itemset mining algorithm for uncertain data, which uses the probabilistic one-way frequent pattern tree (PUFP-tree) to mine probabilistic frequent itemsets. The second part proposes a classification algorithm based on association rules, which uses the probabilistic frequent itemsets to mine positive correlation rules and then uses those rules to classify the uncertain data. The third part prunes the classified uncertain data, performs a skyline query on the pruned data, and finally adds noise to the query result.
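To make the core idea concrete, the following is a minimal sketch, not the thesis's algorithm. It assumes the dominance definition commonly used for incomplete data (two tuples are compared only on the dimensions both of them observe, smaller values being better) and the standard Laplace mechanism with scale sensitivity/epsilon. The function names, the `sensitivity` parameter, and the choice to perturb every observed attribute value are illustrative assumptions; the weighted decision tree, attribute-reduction pruning, optimal shadow skyline points, and budget allocation described above are not reproduced here.

```python
import random
from typing import List, Optional, Sequence

# A tuple with possibly missing dimensions; None marks a missing value.
Point = Sequence[Optional[float]]


def dominates(p: Point, q: Point) -> bool:
    """True if p dominates q on the dimensions both observe (smaller is better)."""
    shared = [(a, b) for a, b in zip(p, q) if a is not None and b is not None]
    if not shared:
        return False  # no common dimension: the two tuples are incomparable
    return all(a <= b for a, b in shared) and any(a < b for a, b in shared)


def skyline(points: List[Point]) -> List[Point]:
    """Naive O(n^2) skyline: keep every point that no other point dominates."""
    return [
        p for p in points
        if not any(dominates(q, p) for q in points if q is not p)
    ]


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_skyline(
    points: List[Point],
    epsilon: float,
    sensitivity: float,
    rng: Optional[random.Random] = None,
) -> List[List[Optional[float]]]:
    """Run the skyline query, then perturb each observed value with Laplace noise.

    Assumption: `sensitivity` is supplied by the caller; deriving the correct
    sensitivity for a skyline query is outside the scope of this sketch.
    """
    if epsilon <= 0 or sensitivity <= 0:
        raise ValueError("epsilon and sensitivity must be positive")
    rng = rng or random.Random()
    scale = sensitivity / epsilon
    return [
        [None if v is None else v + laplace_noise(scale, rng) for v in p]
        for p in skyline(points)
    ]


if __name__ == "__main__":
    data = [(1.0, 2.0, None), (2.0, 3.0, 4.0), (None, 2.0, 1.0)]
    print(skyline(data))
    print(noisy_skyline(data, epsilon=1.0, sensitivity=1.0))
```

Under this dominance definition the relation is not transitive and can even be cyclic, so a data set may have an empty skyline. This is one reason specialized algorithms for incomplete data, such as the one proposed in the thesis, are needed beyond the naive pairwise check shown here. A smaller epsilon gives a larger noise scale and therefore stronger privacy at the cost of less accurate results.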
Keywords/Search Tags: Skyline query, incomplete data, uncertain data, differential privacy