Research On Classification Based On Anonymized Data

Posted on:2012-09-21

Degree:Master

Type:Thesis

Country:China

Candidate:R C Zhang

Full Text:PDF

GTID:2178330338992288

Subject:Computer application technology

Abstract/Summary:

With the rapid development of information technology, and of database technology in particular, collecting, managing and analyzing massive data sets has become convenient. Data mining techniques, including classification, play an active role in many in-depth applications. At the same time, they raise serious privacy concerns: the data used for mining contain a great deal of personal information, and disclosing it can cause great harm to the individuals involved. If such data are handed to data miners as-is, disclosure of private information is inevitable. As data mining is applied more widely, privacy disclosure has become an increasingly serious problem, and how to carry out data mining under privacy protection has become a focus of data mining research.

Classification is an active research field in data mining. Many techniques have been proposed for it, including decision tree, nearest neighbor, neural network, support vector machine and Bayes classification. However, these algorithms operate on the original data and can easily disclose private information. With the growing study of uncertain data, uncertain data mining has become a hot topic, and extending traditional classification to uncertain data is a current trend.

This thesis focuses on classification based on anonymized data, modeling data anonymized by k-anonymity as uncertain data. We propose a new approach for building classifiers from anonymized data. The method does not assume any probability distribution for the data. Instead, we propose collecting all necessary statistics during anonymization and releasing them together with the anonymized data as new attributes. These new attributes consist of the expected value and variance for numerical quasi-identifiers and the probability mass function for categorical quasi-identifiers. With them, the expected values of kernel functions or squared distances can be computed easily. Finally, we use the classifier to classify the anonymized data.

This thesis proposes an improved method for building a classifier from anonymized data, KCNN-SVM. The method makes classification on anonymized data possible, improves the classification algorithm and improves classification efficiency.
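
The idea of releasing per-group statistics and computing expected squared distances from them can be illustrated with a minimal sketch. It is not the thesis's KCNN-SVM method. It assumes that a record is a uniform draw from its equivalence class, that two records are independent, and that categorical values are compared with a 0/1 mismatch distance. All function names and sample values are hypothetical. Under those assumptions, E[(X - Y)^2] = (mu_X - mu_Y)^2 + var_X + var_Y for a numerical attribute, and E[X != Y] = 1 - sum_c p_X(c) p_Y(c) for a categorical one.

```python
from collections import Counter
from statistics import fmean, pvariance


def summarize_numeric(values):
    """Return (mean, population variance) of a numeric quasi-identifier
    over one equivalence class (hypothetical helper)."""
    return fmean(values), pvariance(values)


def summarize_categorical(values):
    """Return the probability mass function of a categorical
    quasi-identifier over one equivalence class (hypothetical helper)."""
    n = len(values)
    return {category: count / n for category, count in Counter(values).items()}


def expected_sq_distance(a, b):
    """Expected squared difference of two independent values, each known
    only through its (mean, variance) pair."""
    (mu_a, var_a), (mu_b, var_b) = a, b
    return (mu_a - mu_b) ** 2 + var_a + var_b


def expected_mismatch(p, q):
    """Expected 0/1 mismatch of two independent categorical values, each
    known only through its probability mass function."""
    return 1.0 - sum(prob * q.get(category, 0.0) for category, prob in p.items())


# Two made-up equivalence classes of size k = 3.
ages_1, ages_2 = [23, 25, 27], [40, 44, 48]
jobs_1 = ["nurse", "nurse", "teacher"]
jobs_2 = ["teacher", "clerk", "clerk"]

age_term = expected_sq_distance(summarize_numeric(ages_1), summarize_numeric(ages_2))
job_term = expected_mismatch(summarize_categorical(jobs_1), summarize_categorical(jobs_2))

print("expected squared age distance:", age_term)
print("expected job mismatch:", job_term)
```

A distance-based classifier could add such per-attribute terms to compare two anonymized records without seeing the original values. How the terms are weighted or fed into a kernel is a design choice this sketch leaves open.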

Keywords/Search Tags:

privacy preservation, classification of anonymized data, KCNN-SVM

Related items

1	Research And Implementation Of Data Anonymized Privacy Protection Method
2	Methods for evaluating the privacy of anonymized network data
3	Checking And Preventing Privacy Inference Attacks Based On K-Anonymized Microdata
4	A Dynamic Privacy Preserving Method On Protecting Trajectory Data
5	Research On Anonymized Privacy Preserving Publishing Of Data Streams
6	The Research On Privacy-preserving Data Publishing For Data Classification Analysis
7	Research And Application Of Privacy-Preserving Query On Multidimensional Data For Cloud Storage
8	Research On Privacy-Preserve In Location System
9	Research On Privacy Preserving Classification Algorithm For Horizontal Distribution Data
10	Research On Privacy-Preserving Secure Search Technologies In Cloud Computing