Using entropy as a measure of privacy loss in statistical databases

Posted on:2005-07-30

Degree:M.S.

Type:Thesis

University:The University of Texas at El Paso

Candidate:Chirayath, Vinod

GTID:2458390008491460

Subject:Computer Science

Abstract/Summary:

Although the Internet is a vast source of information for individuals, it is also a major source of information about individuals. Data collection through surveys, registration pages, and user forms has made more personal information available than before. Organizations such as the Census Bureau, insurance companies, hospitals, and universities keep databases that contain valuable information, raising concerns about individual privacy. The problem of privacy in such databases is to protect information specific to an individual while releasing aggregate data for research purposes. Although there are several approaches to privacy preservation, definitions of privacy loss are either missing or tailored to each approach. To compare different approaches, we need a definition of privacy that not only determines whether privacy loss occurs but also measures the amount of privacy loss.

In this thesis, we propose a definition of privacy based on the concept of entropy used in information theory. Entropy measures the amount of information in a signal based on the probabilities of its possible values. We apply this notion of information to the records in a database. The amount of privacy loss caused by a statistical release is defined as the difference between the entropy of a record before the release and its entropy after the release. We argue that this notion of privacy loss is intuitive. In particular, we consider how the amount of privacy loss from releasing the average of a randomly generated one-dimensional database changes as the size of the database increases.
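The entropy-difference definition in the abstract can be illustrated with a small Python sketch. It is not the thesis's own code or experimental setup. It assumes, purely for illustration, that each record is an independent value drawn uniformly from 0..k-1, that the attacker knows the database size n (so releasing the average is equivalent to releasing the sum), and that Shannon entropy in bits is the measure. The function names `entropy`, `sum_distribution`, `privacy_loss`, and `expected_privacy_loss` are hypothetical, and averaging the loss over all possible released sums is a choice made here, not something stated in the abstract.

```python
import math


def entropy(probs):
    """Shannon entropy, in bits, of a list of probabilities."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def sum_distribution(n, k):
    """Distribution of the sum of n independent values, each uniform on 0..k-1."""
    dist = {0: 1.0}
    for _ in range(n):
        nxt = {}
        for s, p in dist.items():
            for v in range(k):
                nxt[s + v] = nxt.get(s + v, 0.0) + p / k
        dist = nxt
    return dist


def privacy_loss(n, k, released_sum):
    """Entropy of one record before the release minus its entropy after it.

    Assumption: the attacker's prior for the record is uniform on 0..k-1,
    and the release is the sum of all n records.
    """
    prior = [1.0 / k] * k
    rest = sum_distribution(n - 1, k)  # sum of the other n-1 records
    # Bayes' rule: P(record = v | sum) is proportional to P(v) * P(rest = sum - v)
    weights = [prior[v] * rest.get(released_sum - v, 0.0) for v in range(k)]
    total = sum(weights)
    if total == 0:
        raise ValueError("released_sum is impossible for this n and k")
    posterior = [w / total for w in weights]
    return entropy(prior) - entropy(posterior)


def expected_privacy_loss(n, k):
    """Privacy loss averaged over every sum the database could release."""
    return sum(p * privacy_loss(n, k, s) for s, p in sum_distribution(n, k).items())


if __name__ == "__main__":
    for n in (1, 2, 5, 10, 20):
        print(n, round(expected_privacy_loss(n, k=4), 4))
```

Under these assumptions, a database with a single record loses all of its entropy when the average is released, and the expected loss per record shrinks as n grows, which matches the intuition the abstract appeals to.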

Keywords/Search Tags:

Privacy loss, Database, Information, Entropy, Statistical

Related items

1	Graph Privacy Computation And Quantization Method Based On Structural Entropy
2	Research Of Privacy Homomorphism Based On Manipulation Of Encrypted Database Information
3	Research And Implementation Of Data Anonymized Privacy Protection Method
4	Study On Attribute Reduction Criteria And Information Loss Of Attribute Reduction Based On Rough Sets
5	Research And Application Of Rational Privacy Measurement
6	The Research Of Privacy Protection Methodologies On The Trusted Database
7	Research On Personal Sensitive Data Privacy Protection
8	Research On Location Privacy Protection Method Based On Information Entropy And Caching Technology
9	Research On K-Anonymity For Privacy Preserving
10	Research On K-anonymous Algorithm Based On The Sensitive Degree Of Privacy