Measurement, Evaluation, and Defense against Privacy Risks to Web Users

Posted on: 2018-05-09
Degree: Ph.D.
Type: Dissertation
University: Indiana University
Candidate: Kaizer, Andrew John
Full Text: PDF
GTID: 1476390020956136
Subject: Computer Science
Abstract/Summary:
The Internet has become a critical tool in all aspects of our lives, from entertainment and social networking to managing financial and health information. However, this reliance on the Internet has a dark side for privacy: users are constantly tracked and profiled with minimal transparency or consent. The approaches used to track users and extract their information rely primarily on two kinds of data: machine-based personal information (MPI), e.g., browser version and installed plugins, and personally identifiable information (PII), e.g., email address, name, and credit card details. In this dissertation we measure the privacy risks associated with how first party websites expose users, and how third parties track and profile them, using MPI and PII approaches that have not been previously explored, and we propose techniques that users can employ to protect their privacy. We break this dissertation into three core contributions. First, we study the differences between how first party websites treat logged-in and not-logged-in users to establish that users in each state encounter different advertisers, profilers, and tracking firms. We also explore how PII is leaked and use this information to build a model that identifies which PII is most valuable to third parties. Our second contribution establishes that the privacy footprint of a first party website can be calculated without providing any PII. We observe that the same PII is accessible when logged-in and not-logged-in; detail how website category, popularity, and PII co-occurrence affect what PII is requested; and compare models to create a "privacy score" that succinctly informs users when a website requests more PII than expected. Our third contribution is the development of a machine learning classifier capable of identifying third party JavaScript-based MPI tracking with 97.7% accuracy. This expands upon prior work that primarily focused on identifying specific JavaScript-based vectors: cookie-based, font-based, and graphic-based tracking.
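The abstract does not say how the "privacy score" is computed, so the following is only an illustrative sketch of the general idea of flagging a website that requests more PII than expected. It assumes, hypothetically, that the expectation is the average number of distinct PII fields requested by other sites in the same category; the function names and sample data are invented for illustration and are not taken from the dissertation.

```python
from statistics import mean


def expected_pii_count(category_sites):
    """Average number of distinct PII fields requested per site in a category.

    category_sites: list of lists, one list of requested field names per site.
    This baseline is an assumption made for illustration only.
    """
    return mean(len(set(fields)) for fields in category_sites)


def privacy_score(requested_fields, category_sites):
    """Requested PII count minus the category baseline.

    A positive value means the site asks for more PII fields than the
    (assumed) category average; a negative value means fewer.
    """
    return len(set(requested_fields)) - expected_pii_count(category_sites)


# Hypothetical sample data: PII fields requested by three news sites.
news_sites = [
    ["email"],
    ["email", "name"],
    ["email", "name", "zip"],
]

# A hypothetical site requesting five distinct fields.
score = privacy_score(
    ["email", "name", "zip", "phone", "birthdate"], news_sites
)
print(score)
```

A positive score here would prompt a warning to the user. The models compared in the dissertation also account for popularity and PII co-occurrence, which this simple difference ignores.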
Keywords/Search Tags: PII, Privacy