Research In Microaggregation Algorithm For K-Anonymization

Posted on:2010-08-22

Degree:Master

Type:Thesis

Country:China

Candidate:T T Cen

Full Text:PDF

GTID:2178360278968331

Subject:Computer software and theory

Abstract/Summary:

The k-anonymity model has been extensively investigated for its simplicity and practicality. The model requires that each record in the anonymized table be indistinguishable from at least k-1 other records in the table with respect to a set of quasi-identifier attributes. Adversaries therefore cannot uniquely identify individuals, and the individuals' privacy is preserved. Most existing k-anonymization algorithms are based on generalization and suppression, which have shortcomings in efficiency and in preserving the semantics of numerical data. Recently, microaggregation techniques have been introduced for k-anonymizing datasets, and they remedy some of these shortcomings. The idea of microaggregation is to partition a table into clusters using a heuristic method, such that each cluster contains at least k records and the records within a cluster are as similar as possible. The records of each cluster are then replaced by the cluster's centroid to achieve k-anonymity.

In this thesis, we investigate a microaggregation algorithm based on global search, implement a microaggregation algorithm for mixed data, and propose a comprehensive evaluation framework for microaggregation algorithms. The main contributions are as follows:

(1) An Immune Clonal Selection Microaggregation Algorithm (ICSMA) is proposed to improve the quality of anonymized data. It improves the standard immune clonal selection algorithm (ICSA) by introducing an adjustment operator that deletes invalid antibodies during antibody evolution, which accelerates convergence. Experimental results show that ICSMA generates anonymized tables with less information loss and lower disclosure risk than the MDAV algorithm.

(2) A microaggregation algorithm for mixed data is proposed to address the weakness of existing microaggregation algorithms in anonymizing categorical data. The algorithm uses Euclidean distance for numerical data and a weighted hierarchy distance for categorical data, and combines the two into a mixed distance for mixed data. The mode is taken as the center of each categorical attribute and the mean as the center of each numerical attribute. The record values in each cluster are then replaced by this centroid to achieve k-anonymity. Experiments show that the distance measure for categorical data causes less distortion, and that the improved microaggregation algorithm based on the mixed distance achieves better clustering quality than the traditional MDAV algorithm.

(3) An evaluation model for k-anonymized data oriented to microaggregation (EM4ADOM) is proposed. The model evaluates a microaggregation algorithm in terms of data utility, information loss, and the trade-off between data utility and information loss. Experimental results show that the model evaluates anonymized data comprehensively.
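The mixed-distance idea in contribution (2) can be illustrated with a minimal Python sketch. It is not the thesis's algorithm, nor MDAV or ICSMA. The taxonomy `TAXONOMY`, the equal weighting in `mixed_distance`, the range normalization of numerical attributes, and the simple greedy seed selection in `microaggregate` are all assumptions made for illustration, since the abstract does not define them. The sketch only shows how a Euclidean distance and a hierarchy-based distance might be combined, and how a mean/mode centroid replaces the records of each cluster of at least k records.

```python
from collections import Counter
from math import sqrt

NUM_COLS = (0, 1)   # hypothetical layout: age, income
CAT_COLS = (2,)     # hypothetical layout: occupation

# Hypothetical taxonomy: each categorical value maps to its path from the root.
# All leaves are assumed to sit at the same depth.
TAXONOMY = {
    "engineer": ("any", "technical", "engineer"),
    "programmer": ("any", "technical", "programmer"),
    "nurse": ("any", "medical", "nurse"),
    "doctor": ("any", "medical", "doctor"),
}


def hierarchy_distance(a, b):
    """Return 0 for equal values; larger when the common ancestor is nearer the root."""
    path_a, path_b = TAXONOMY[a], TAXONOMY[b]
    common = 0
    for x, y in zip(path_a, path_b):
        if x != y:
            break
        common += 1
    depth = len(path_a) - 1  # number of edges from root to leaf
    return (len(path_a) - common) / depth


def mixed_distance(r, s, ranges, weight=0.5):
    """Weighted sum of a range-normalized Euclidean part and a hierarchy part."""
    num = sqrt(sum(((r[c] - s[c]) / ranges[c]) ** 2 for c in NUM_COLS))
    cat = sum(hierarchy_distance(r[c], s[c]) for c in CAT_COLS) / len(CAT_COLS)
    return weight * num + (1 - weight) * cat


def centroid(cluster):
    """Mean for numerical attributes, mode for categorical attributes."""
    center = []
    for col in range(len(cluster[0])):
        values = [rec[col] for rec in cluster]
        if col in NUM_COLS:
            center.append(sum(values) / len(values))
        else:
            center.append(Counter(values).most_common(1)[0][0])
    return tuple(center)


def microaggregate(records, k):
    """Greedy sketch: group each seed with its k-1 nearest records, then
    replace every record by the centroid of its cluster."""
    if len(records) < k:
        raise ValueError("need at least k records")
    ranges = {}
    for c in NUM_COLS:
        column = [rec[c] for rec in records]
        ranges[c] = (max(column) - min(column)) or 1.0  # avoid division by zero
    remaining = list(range(len(records)))
    clusters = []
    while len(remaining) >= 2 * k:
        seed = remaining[0]
        remaining.sort(key=lambda i: mixed_distance(records[seed], records[i], ranges))
        clusters.append(remaining[:k])
        remaining = remaining[k:]
    clusters.append(remaining)  # the last cluster holds between k and 2k-1 records
    anonymized = [None] * len(records)
    for cluster in clusters:
        center = centroid([records[i] for i in cluster])
        for i in cluster:
            anonymized[i] = center
    return anonymized
```

Calling `microaggregate(records, k)` on a list of `(age, income, occupation)` tuples returns a list of the same length in which every distinct row appears at least k times. A real implementation would choose seeds more carefully, for example relative to the dataset centroid as MDAV does, and would measure the resulting information loss.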

Keywords/Search Tags:

K-anonymization, Generalization/Suppression, Microdata, Microaggregation, Privacy Preservation, Immune Clonal Selection Algorithm

Related items

1	Research On Microdata Anonymity Algorithms For Privacy-Preservation Data Publishing
2	Research On Personalized Privacy Preservation Method For Microdata Publishing
3	Research On Anonymity Technology For Microdata Publishing
4	Checking And Preventing Privacy Inference Attacks Based On K-Anonymized Microdata
5	Improved Immune Clone Selection Algorithms And Its Applications
6	Clonal Selection Algorithm And Its Application In High-Dimensional Global Function Optimization
7	Difference Variation Of Clonal Selection Algorithm And Its Application
8	Immune Clonal Strategy Algorithms And Their Application
9	Study And Application Based On Adaptive Immune Clonal Selection Algorithm
10	Research On Anonymization Technique Based Privacy Preserving Method On Facial Image