Spam, Fraud, and Bots: Improving the Integrity of Online Social Media Data

Posted on: 2018-10-10
Degree: Ph.D.
Type: Dissertation
University: The University of New Mexico
Candidate: Minnich, Amanda Jean
Full Text: PDF
GTID: 1478390020456285
Subject: Computer Science
Abstract/Summary:
Online data contains a wealth of information, but like most user-generated content, it is full of noise, fraud, and automated behavior. The prevalence of "junk" and fraudulent text affects users, businesses, and researchers alike. To make matters worse, ground truth data for these types of text is lacking, and the text's appearance changes constantly as fraudsters adapt to pressure from hosting sites. The goal of my dissertation is therefore to extract high-quality content from large, complex social media datasets, and to identify fraudulent and automated behavior in them, in the absence of ground truth data. Specifically, I design a collection of data inspection, filtering, fusion, mining, and exploration algorithms to automate data cleaning to produce usable data for mining algorithms, quantify the trustworthiness of business behavior on e-commerce sites, and efficiently identify automated accounts in large and constantly changing social networks. The main components of this work are noise removal, data fusion, multi-source feature generation, network exploration, and anomaly detection.
Keywords/Search Tags: Data, Social