Feature Selection Based On Differential Privacy

Posted on:2016-09-02

Degree:Master

Type:Thesis

Country:China

Candidate:J Yang

Full Text:PDF

GTID:2348330473965466

Subject:Computer application technology

Abstract/Summary:

Feature selection is one of the key problems in pattern recognition, machine learning, and data mining. It selects an effective subset from a group of features in order to reduce the dimensionality of the feature space. Beyond reducing dimensionality, feature selection can also serve as a knowledge discovery tool for finding the true variables of the underlying natural model. Privacy preservation is likewise an active topic in data mining: how to protect the privacy of personal information during knowledge discovery has become a concern for current researchers. However, existing research on privacy-preserving data mining focuses mainly on classification and regression, and little work addresses privacy-preserving feature selection.

This thesis studies feature selection that protects data privacy, based mainly on differential privacy. For a feature weighting algorithm based on local learning, the Output Perturbation and Objective Perturbation strategies are each used to improve the privacy-preserving performance of the feature selection algorithm. We analyze the correctness of the resulting algorithms and run experiments to verify their effectiveness. Experimental results on several real data sets show that, under the same conditions (data sets, experimental parameters, classifiers, etc.), privacy-preserving feature selection based on Objective Perturbation achieves better privacy-preserving performance than privacy-preserving feature selection based on Output Perturbation.

In addition, we study two types of privacy-preserving ensemble feature selection methods based on the Output Perturbation strategy. Combined with different classification algorithms (nearest neighbor and support vector machine), the performance of ensemble feature selection based on differential privacy is verified on several real data sets. The experimental results show that, under the same conditions (data sets, experimental parameters, classifiers, etc.), the algorithm that applies ensemble learning after privacy preservation achieves better privacy-preserving performance than the algorithm that applies privacy preservation after ensemble learning.
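As a rough illustration of the Output Perturbation idea named in the abstract, the sketch below computes feature weights first and then adds Laplace noise to them before ranking features. It is not the thesis's algorithm: the scorer `toy_feature_weights` is a hypothetical stand-in for the local-learning-based feature weighting, and the function names, the assumption that every feature is scaled to [0, 1], and the synthetic data are all introduced here for illustration only.

```python
import numpy as np


def toy_feature_weights(X, y):
    """Hypothetical scorer (not from the thesis).

    For each feature j, returns |mean_i(x_ij * s_i)| with s_i = +1 for
    label 1 and s_i = -1 for label 0.
    """
    signs = np.where(y == 1, 1.0, -1.0)
    return np.abs((X * signs[:, None]).mean(axis=0))


def private_feature_weights(X, y, epsilon, rng):
    """Output perturbation: compute the weights, then add Laplace noise.

    Assumes every feature lies in [0, 1]. Replacing one record then changes
    each weight of the toy scorer by at most 2/n, so its L1 sensitivity is
    at most 2*d/n, which sets the Laplace noise scale.
    """
    n, d = X.shape
    l1_sensitivity = 2.0 * d / n
    weights = toy_feature_weights(X, y)
    noise = rng.laplace(loc=0.0, scale=l1_sensitivity / epsilon, size=d)
    return weights + noise


def select_top_k(weights, k):
    """Return the indices of the k largest weights, largest first."""
    return np.argsort(weights)[::-1][:k]


# Synthetic data (assumption): only feature 0 depends on the label.
rng = np.random.default_rng(0)
n, d = 2000, 5
y = rng.integers(0, 2, size=n)
X = rng.random((n, d))
X[:, 0] = np.clip(0.5 * y + 0.5 * X[:, 0], 0.0, 1.0)

noisy = private_feature_weights(X, y, epsilon=1.0, rng=rng)
selected = select_top_k(noisy, k=2)
print("noisy weights:", noisy)
print("selected features:", selected)
```

Objective Perturbation differs in that the randomness is added to the optimization objective before the weights are learned, rather than to the finished weights; that variant depends on the specific objective used in the thesis and is not sketched here.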

Keywords/Search Tags:

Feature Selection, Privacy Preserving, Ensemble, Output Perturbation, Objective Perturbation

Related items

1	A Study Of Privacy-Preserving Data Mining Based On Multiplicative Perturbation
2	Research On Social Network Privacy Preserving Method Based On Perturbation Matrix
3	Privacy Preserving Association Rule Mining
4	Research On Data Mining Privacy Preserving Method Based On Random Perturbation
5	Approaches For Location Privacy Preserving Based On Feature Security
6	Research On Data Perturbation Privacy Preserving Method For Distributed Clustering
7	Research On Privacy-Preservation Technique Based On Random Projection Data Perturbation
8	Research Of Privacy Preserving Data Mining Based On Perturbation
9	Privacy Preserving For Network Traffic Data Based On Perturbation
10	Research On Multi-parameters Perturbation Privacy Preserving Association Rules Mining Algorithm