Research On Privacy Protection Method For Multi-type Data Publishing

Posted on: 2021-05-18
Degree: Master
Type: Thesis
Country: China
Candidate: S X Wu
GTID: 2518306032467054
Subject: Computer technology
Abstract/Summary:
In data publishing, data mining and analysis technologies can fully extract the value of data and promote the growth of related industries. However, malicious data mining seriously threatens users' private information. Privacy-preserving data publishing, which aims to keep published data both secure and useful, has therefore developed rapidly and produced corresponding research results. Most current research, however, addresses general protection needs, and data publishing under special protection needs still requires attention. This thesis therefore studies two areas: the combination of special attack types with special diversity requirements, and the publishing of incomplete data. The work of this thesis is as follows:

(1) In complete data publishing, the traditional cross-bucket generalization method cannot resist sensitive attacks. To address this defect, a cross-bucket generalization method based on sensitive attribute classification, the (k,l,αi)-CBG ((k,l,αi)-Cross-Bucket Generalization) method, is proposed. First, the cal_SV_S_T algorithm is proposed for computing the sensitive value set table. When a sensitive value set is selected, the proportion of sensitive values with a high sensitivity level is limited, so that the tuples in each tuple bucket (a collection of tuples) satisfy the threshold for that level. Then the generalize_T algorithm is proposed. When equivalence classes are divided, the more similar tuples are selected according to the distance between tuples. Experiments show that this method has advantages in efficiency and information validity while effectively resisting sensitive attacks.

(2) In missing data publishing, the AIDRL (Anonymity for Incomplete Data based on Reconstruction and l-diversity) method is proposed to address the excessive information loss caused by reconstruction. First, an IDR (Incomplete Data Reconstruct) algorithm based on Mondrian partitioning is proposed. Part of the complete data is used to simulate missing data in an approximate proportion, and the remaining complete data is partitioned to reconstruct the simulated missing data one by one. Once the accuracy of the reconstructed simulated data reaches the threshold, the resulting partition parameters are used to reconstruct the original missing data. Then the Equivalence Class Build based on l-diversity (ECBL) algorithm is proposed, which selects tuples to construct equivalence classes according to the top l sensitive values currently remaining. The way equivalence classes are computed lets tuples with missing values aggregate quickly and reduces information loss. Experiments show that this method better preserves information validity when the demand for sensitive value protection is high.

(3) In missing data publishing, the ASIDL (Anatomy with Slicing for Incomplete Data based on l-diversity) method, based on decomposition and slicing, is proposed to address the reconstruction errors that missing data reconstruction methods can hardly avoid. First, the Tuple Bucket Build based on l-diversity (TBBL) algorithm clusters tuples on their QI (quasi-identifier) attributes and randomly selects tuples with different sensitive values to form tuple buckets, which protects the identity of each tuple. Then the Tuple Bucket Split based on l-diversity (TBSL) algorithm is proposed, which splits existing tuple buckets while still satisfying l-diversity, producing smaller buckets. Experiments show that this method has advantages in both information loss and efficiency.
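As an illustration of the bucket-building idea described in (2) and (3), the following is a minimal Python sketch of forming l-diverse tuple buckets by repeatedly drawing one tuple from each of the l sensitive values that currently have the most remaining tuples. It is not the thesis's ECBL or TBBL algorithm. The function names, the `disease` attribute and the record layout are all assumptions made for this example, and the sketch uses distinct l-diversity (at least l different sensitive values per bucket) as its privacy condition.

```python
from collections import defaultdict


def is_l_diverse(bucket, l, sensitive_key="disease"):
    """Distinct l-diversity: the bucket holds at least l different sensitive values."""
    return len({row[sensitive_key] for row in bucket}) >= l


def build_buckets(rows, l, sensitive_key="disease"):
    """Greedy sketch (hypothetical helper): group tuples by sensitive value,
    then repeatedly take one tuple from each of the l values with the most
    remaining tuples. Returns (buckets, leftovers)."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[sensitive_key]].append(row)

    buckets = []
    # Continue while at least l sensitive values still have tuples left.
    while sum(1 for g in groups.values() if g) >= l:
        remaining = [v for v in groups if groups[v]]
        top = sorted(remaining, key=lambda v: len(groups[v]), reverse=True)[:l]
        buckets.append([groups[v].pop() for v in top])

    # Tuples that cannot form another l-diverse bucket are returned separately;
    # how to place them is left open in this sketch.
    leftovers = [row for g in groups.values() for row in g]
    return buckets, leftovers
```

Because only the sensitive attribute is inspected, rows whose quasi-identifier values are missing (for example `None`) can be bucketed without being reconstructed first. This is the general motivation for decomposition-style approaches on incomplete data, though the thesis's own handling may differ.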
Keywords/Search Tags:Privacy preserving, Sensitive attacks, Cross-bucket generalization, Incomplete data, Slicing