Feature Selection Based On Differential Privacy

Posted on: 2016-09-02
Degree: Master
Type: Thesis
Country: China
Candidate: J Yang
GTID: 2348330473965466
Subject: Computer application technology

Abstract/Summary:
Feature selection is one of the key problems in pattern recognition, machine learning, and data mining. It is the process of selecting a subset of effective features from a larger set in order to reduce the dimensionality of the feature space. Besides reducing dimensionality, feature selection can also serve as a knowledge discovery tool for finding the true variables of the underlying natural model. Privacy preservation, meanwhile, is a hot topic in data mining: how to protect the privacy of personal information during knowledge discovery has become a concern for current researchers. However, existing research on privacy-preserving data mining focuses mainly on classification and regression, and there is little work on privacy-preserving feature selection.

This thesis studies privacy-preserving feature selection, mainly based on differential privacy. For a feature weighting algorithm based on local learning, the Output Perturbation and Objective Perturbation strategies are each applied to improve the privacy-preserving performance of the feature selection algorithm. We analyze the correctness of the resulting algorithms and run experiments to verify their effectiveness. Experimental results on several real data sets show that, under the same conditions (data sets, experimental parameters, classifiers, etc.), privacy-preserving feature selection based on Objective Perturbation achieves better privacy-preserving performance than privacy-preserving feature selection based on Output Perturbation.

In addition, we study two types of privacy-preserving ensemble feature selection methods based on the Output Perturbation strategy. Combined with different classification algorithms (nearest neighbor and support vector machine), the performance of ensemble feature selection based on differential privacy is verified on several real data sets. The experimental results show that, under the same conditions (data sets, experimental parameters, classifiers, etc.), applying ensemble learning after privacy preservation achieves better privacy-preserving performance than applying privacy preservation after ensemble learning.
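
As a rough illustration of the two ensemble orderings compared above, the following Python sketch applies output perturbation to feature-weight vectors. It is not the thesis's algorithm: the abstract does not give the noise distribution or the sensitivity bound, so the per-coordinate Laplace noise, the `sensitivity` parameter, and all function names here are assumptions made for illustration. Privacy-budget accounting across base selectors is also ignored, and Objective Perturbation (which adds a random term to the training objective before optimization) is not shown.

```python
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def perturb_weights(weights, epsilon, sensitivity, rng):
    """Output perturbation: add noise to feature weights that were already learned.

    Assumption: `sensitivity` is a known bound on how much the weights can
    change when one training record changes; the abstract does not state it.
    """
    scale = sensitivity / epsilon
    return [w + laplace_noise(scale, rng) for w in weights]


def average(weight_vectors):
    """Coordinate-wise mean of several feature-weight vectors."""
    n = len(weight_vectors)
    return [sum(column) / n for column in zip(*weight_vectors)]


def top_k(weights, k):
    """Indices of the k features with the largest weights."""
    order = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)
    return order[:k]


def ensemble_after_privacy(weight_vectors, epsilon, sensitivity, rng):
    """Perturb each base selector's weights, then average the noisy vectors."""
    noisy = [perturb_weights(w, epsilon, sensitivity, rng) for w in weight_vectors]
    return average(noisy)


def privacy_after_ensemble(weight_vectors, epsilon, sensitivity, rng):
    """Average the base selectors' weights first, then perturb the average once."""
    return perturb_weights(average(weight_vectors), epsilon, sensitivity, rng)


# Hypothetical weights from two base feature selectors over three features.
rng = random.Random(0)
base_weights = [[0.9, 0.1, 0.5], [0.8, 0.2, 0.6]]
selected_a = top_k(ensemble_after_privacy(base_weights, 1.0, 0.1, rng), 2)
selected_b = top_k(privacy_after_ensemble(base_weights, 1.0, 0.1, rng), 2)
print(selected_a, selected_b)
```

In this sketch, averaging several independently perturbed vectors partly cancels the noise, whereas perturbing the average adds a single noise draw per feature. Whether that explains the experimental result reported above depends on the sensitivity analysis and budget allocation in the thesis itself, which the abstract does not give.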
Keywords/Search Tags: Feature Selection, Privacy Preserving, Ensemble, Output Perturbation, Objective Perturbation