Managing Quality, Identity and Adversaries in Public Discourse with Machine Learning

Posted on: 2013-03-28
Degree: Ph.D
Type: Dissertation
University: Drexel University
Candidate: Brennan, Michael
Full Text: PDF
GTID: 1458390008966941
Subject: Computer Science
Abstract/Summary:
Automation can mitigate the problems that arise when managing quality and identity in public discourse on the web at scale. Discourse needs to be curated and filtered, and anonymous speech has to be supported while adversaries are handled. Reliance on human curators or analysts does not scale, and content can be missed. These scaling and management issues include the limits of crowdsourced comment rating systems, flaws in crowdsourced topic identification systems, and the identification or anonymization of large numbers of authors. As rating and topic identification systems scale, relevant, high-quality discourse is missed. Authorship recognition without automation is time-consuming and costly, and it does not scale. Using machine learning to automate authorship recognition raises serious privacy and anonymity concerns, and deception in writing style can be difficult to detect. This work replicates the comment ratings and topic identification of crowdsourcing systems that currently rely on human participation to be effective. It also demonstrates the ability of machine learning to identify authors quickly and accurately, as well as methods to circumvent this identification. Finally, this work presents novel data sets as a foundation for future research in comment classification and adversarial authorship recognition.
Keywords/Search Tags: Discourse, Quality, Authorship recognition, Machine