Research And Implementation On Data Mining Methods Based On Privacy Preserving

Posted on:2021-07-04

Degree:Master

Type:Thesis

Country:China

Candidate:Y L Yang

Full Text:PDF

GTID:2518306308470214

Subject:Computer Science and Technology

Abstract/Summary:

The widespread application of big data and artificial intelligence technologies has made it possible to tap the great value behind data, but it has also brought a serious risk of privacy leakage. How to achieve open sharing and efficient mining of big data while ensuring data security has become an increasingly important research area. To cope with the risk of privacy leakage in data mining methods, this thesis studies big data privacy protection technology in depth and designs and implements two privacy-preserving data mining models for unstructured data, both of which effectively balance data security and availability. The main innovative contributions of the thesis are as follows:

(1) To address privacy leakage in deep learning models and the lack of transparency in privacy protection, this thesis combines differential privacy with generative models and proposes an adaptive differential privacy generative adversarial network model (Adp-GAN). Through an adaptive differential privacy mechanism, Adp-GAN allocates Laplacian noise to the input features of the affine transformation layer of the neural network that serves as the discriminator, and to the polynomial approximation coefficients of the loss function at the output layer. While providing differential privacy protection, Adp-GAN reduces the privacy budget consumed during training and improves the utility of the model. Experiments on the standard datasets MNIST and CelebA verify that Adp-GAN can generate higher-quality data. In addition, membership inference attack experiments show that Adp-GAN is more resistant to such attacks.

(2) To address the deficiencies of traditional data masking techniques, this thesis focuses on the identification of unstructured sensitive data and constructs an adaptive data masking named entity recognition model (Adm-NER). Building on the Bi-LSTM-CRF model, Adm-NER applies adversarial transfer learning to data desensitization, so that it can identify sensitive data in domains that lack labeled samples, and it adds a self-attention mechanism to help locate word boundaries and achieve higher recognition accuracy. The results of five comparative experiments show that Adm-NER significantly improves the accuracy of identifying sensitive data. In addition, a transfer learning experiment from the news domain to the medical domain shows that Adm-NER can adaptively learn common features from large-scale labeled samples in the news domain and use them to locate and recognize sensitive data accurately in the medical domain, which supports subsequent data desensitization. Adm-NER offers a new approach to the intelligent design of big data masking systems.
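The idea in contribution (1) of allocating Laplacian noise to input features under a fixed privacy budget can be illustrated with a minimal sketch. It is not the Adp-GAN implementation: the abstract does not state the allocation rule, so the proportional split in `allocate_budget`, the `relevance` scores, and the unit per-feature `sensitivity` are all assumptions made for illustration.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponential variables with mean `scale`
    # follows a Laplace(0, scale) distribution.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def allocate_budget(total_epsilon, relevance):
    # Hypothetical rule: split the total budget in proportion to positive
    # relevance scores, so more relevant features receive a larger epsilon
    # and therefore less noise. The per-feature budgets sum to total_epsilon.
    total = sum(relevance)
    return [total_epsilon * r / total for r in relevance]


def privatize(features, relevance, total_epsilon, sensitivity=1.0, seed=None):
    # Laplace mechanism per feature: noise scale = sensitivity / epsilon_i.
    # `sensitivity` is an assumed bound on how much one record can change
    # each feature value.
    rng = random.Random(seed)
    budgets = allocate_budget(total_epsilon, relevance)
    return [
        x + laplace_noise(sensitivity / eps, rng)
        for x, eps in zip(features, budgets)
    ]


if __name__ == "__main__":
    noisy = privatize([0.5, 0.2, 0.9], relevance=[3.0, 1.0, 1.0], total_epsilon=1.0, seed=0)
    print(noisy)
```

Under sequential composition, spending the per-feature budgets one after another consumes exactly `total_epsilon`, which is why the sketch normalizes the relevance scores before splitting.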

Keywords/Search Tags:

Differential privacy, Generative adversarial network, Data masking, Named entity recognition

Related items

1	Data Privacy Masking Of Text Sequence Dataset Based On Generative Adversarial Network
2	Research On Named Entity Recognition And Relation Extraction Between Entities Based On Depth Learning
3	Research On Differential Privacy Protection Techniques For Image Data Based On Generative Adversarial Networks
4	Research On Image Privacy Protection Methods Based On Generative Adversarial Network
5	Research On Face Recognition Using Deep Learning With Privacy Protection
6	Research On Named Entity Recognition For Chinese Privacy Policy
7	Research On Named Entity Recognition Algorithm And Its Implement In Specific Fields
8	Research And Implementation Of Generative Adversarial Network Based Image Privacy Protection Algorithm
9	Research On Chinese Named Entity Recognition Based On Deep Learning
10	Biomedical Named Entity Recognition And Entity Relation Extraction Based On Deep Learning Method